nips nips2005 nips2005-75 nips2005-75-reference knowledge-graph by maker-knowledge-mining

75 nips-2005-Fixing two weaknesses of the Spectral Method

Source: pdf

Author: Kevin Lang

Abstract: We discuss two intrinsic weaknesses of the spectral graph partitioning method, both of which have practical consequences. The ﬁrst is that spectral embeddings tend to hide the best cuts from the commonly used hyperplane rounding method. Rather than cleaning up the resulting suboptimal cuts with local search, we recommend the adoption of ﬂow-based rounding. The second weakness is that for many “power law” graphs, the spectral method produces cuts that are highly unbalanced, thus decreasing the usefulness of the method for visualization (see ﬁgure 4(b)) or as a basis for divide-and-conquer algorithms. These balance problems, which occur even though the spectral method’s quotient-style objective function does encourage balance, can be ﬁxed with a stricter balance constraint that turns the spectral mathematical program into an SDP that can be solved for million-node graphs by a method of Burer and Monteiro. 1 Background Graph partitioning is the NP-hard problem of ﬁnding a small graph cut subject to the constraint that neither side of the resulting partitioning of the nodes is “too small”. We will be dealing with several versions: the graph bisection problem, which requires perfect 1 : 1 2 2 balance; the β-balanced cut problem (with β a fraction such as 1 ), which requires at least 3 β : (1 − β) balance; and the quotient cut problem, which requires the small side to be large enough to “pay for” the edges in the cut. The quotient cut metric is c/ min(a, b), where c is the cutsize and a and b are the sizes of the two sides of the cut. All of the well-known variants of the quotient cut metric (e.g. normalized cut [15]) have similar behavior with respect to the issues discussed in this paper. The spectral method for graph partitioning was introduced in 1973 by Fiedler and Donath & Hoffman [6]. In the mid-1980’s Alon & Milman [1] proved that spectral cuts can be at worst quadratically bad; in the mid 1990’s Guattery & Miller [10] proved that this analysis is tight by exhibiting a family of n-node graphs whose spectral bisections cut O(n 2/3 ) edges versus the optimal O(n1/3 ) edges. On the other hand, Spielman & Teng [16] have proved stronger performance guarantees for the special case of spacelike graphs. The spectral method can be derived by relaxing a quadratic integer program which encodes the graph bisection problem (see section 3.1). The solution to this relaxation is the “Fiedler vector”, or second smallest eigenvector of the graph’s discrete Laplacian matrix, whose elements xi can be interpreted as an embedding of the graph on the line. To obtain a (A) Graph with nearly balanced 8-cut (B) Spectral Embedding (C) Notional Flow-based Embedding Figure 1: The spectral embedding hides the best solution from hyperplane rounding. speciﬁc cut, one must apply a “rounding method” to this embedding. The hyperplane rounding method chooses one of the n − 1 cuts which separate the nodes whose x i values lie above and below some split value x. ˆ 2 Using ﬂow to ﬁnd cuts that are hidden from hyperplane rounding Theorists have long known that the spectral method cannot distinguish between deep cuts and long paths, and that this confusion can cause it to cut a graph in the wrong direction thereby producing the spectral method’s worst-case behavior [10]. In this section we will show by example that even when the spectral method is not fooled into cutting in the wrong direction, the resulting embedding can hide the best cuts from the hyperplane rounding method. This is a possible explanation for the frequently made empirical observation (see e.g. [12]) that hyperplane roundings of spectral embeddings are noisy and therefore beneﬁt from cleanup with a local search method such as Fiduccia-Matheyses [8]. Consider the graph in ﬁgure 1(a), which has a near-bisection cutting 8 edges. For this graph the spectral method produces the embedding shown in ﬁgure 1(b), and recommends that we make a vertical cut (across the horizontal dimension which is based on the Fiedler vector). This is correct in a generalized sense, but it is obvious that no hyperplane (or vertical line in this picture) can possibly extract the optimal 8-edge cut. Some insight into why spectral embeddings tend to have this problem can be obtained from the spectral method’s electrical interpretation. In this view the graph is represented by a resistor network [7]. Current ﬂowing in this network causes voltage drops across the resistors, thus determining the nodes’ voltages and hence their positions. When current ﬂows through a long series of resistors, it induces a progressive voltage drop. This is what causes the excessive length of the embeddings of the horizontal girder-like structures which are blocking all vertical hyperplane cuts in ﬁgure 1(b). If the embedding method were somehow not based on current, but rather on ﬂow, which does not distinguish between a pipe and a series of pipes, then the long girders could retract into the two sides of the embedding, as suggested by ﬁgure 1(c), and the best cut would be revealed. Because theoretical ﬂow-like embedding methods such as [14] are currently not practical, we point out that in cases like ﬁgure 1(b), where the spectral method has not chosen an incorrect direction for the cut, one can use an S-T max ﬂow problem with the ﬂow running in the recommended direction (horizontally for this embedding) to extract the good cut even though it is hidden from all hyperplanes. We currently use two different ﬂow-based rounding methods. A method called MQI looks for quotient cuts, and is already described in [13]. Another method, that we shall call Midﬂow, looks for β-balanced cuts. The input to Midﬂow is a graph and an ordering of its nodes (obtained e.g. from a spectral embedding or from the projection of any embedding onto a line). We divide the graph’s nodes into 3 sets F, L, and U. The sets F and L respectively contain the ﬁrst βn and last βn nodes in the ordering, and U contains the remaining 50-50 balance ng s ro un di Hy pe r pl an e neg-pos split quotient cut score (cutsize / size of small side) 0.01 ctor r ve iedle of F 0.004 0.003 0.00268 0.00232 Best hyperplane rounding of Fiedler Vector Best improvement with local search 0.002 0.00138 0.001 60000 80000 Midflow rounding beta = 1/4 100000 120000 0.00145 140000 Midflow rounding of Fiedler Vector beta = 1/3 160000 180000 200000 220000 240000 number of nodes on ’left’ side of cut (out of 324800) Figure 2: A typical example (see section 2.1) where ﬂow-based rounding beats hyperplane rounding, even when the hyperplane cuts are improved with Fiduccia-Matheyses search. Note that for this spacelike graph, the best quotient cuts have reasonably good balance. U = n − 2βn nodes, which are “up for grabs”. We set up an S-T max ﬂow problem with one node for every graph node plus 2 new nodes for the source and sink. For each graph edge there are two arcs, one in each direction, with unit capacity. Finally, the nodes in F are pinned to the source and the nodes in L are pinned to sink by inﬁnite capacity arcs. This max-ﬂow problem can be solved by a good implementation of the push-relabel algorithm (such as Goldberg and Cherkassky’s hi pr [4]) in time that empirically is nearly linear with a very good constant factor. Figure 6 shows that solving a MidFlow problem with hi pr can be 1000 times cheaper than ﬁnding a spectral embedding with ARPACK. When the goal is ﬁnding good β-balanced cuts, MidFlow rounding is strictly more powerful than hyperplane rounding; from a given node ordering hyperplane rounding chooses the best of U + 1 candidate cuts, while MidFlow rounding chooses the best of 2U candidates, including all of those considered by hyperplane rounding. [Similarly, MQI rounding is strictly more powerful than hyperplane rounding for the task of ﬁnding good quotient cuts.] 2.1 A concrete example The plot in ﬁgure 2 shows a number of cuts in a 324,800 node nearly planar graph derived from a 700x464 pixel downward-looking view of some clouds over some mountains.1 The y-axis of the plot is quotient cut score; smaller values are better. We note in passing that the commonly used split point x = 0 does not yield the best hyperplane cut. Our main ˆ point is that the two cuts generated by MidFlow rounding of the Fiedler vector (with β = 1 3 and β = 1 ) are nearly twice as good as the best hyperplane cut. Even after the best 4 hyperplane cut has been improved by taking the best result of 100 runs of a version of Fiduccia-Matheyses local search, it is still much worse than the cuts obtained by ﬂowbased rounding. 1 The graph’s edges are unweighted but are chosen by a randomized rule which is more likely to include an edge between two neighboring pixels if they have a similar grey value. Good cuts in the graph tend to run along discontinuities in the image, as one would expect. quotient cut score 1 SDP-LB (smaller is better) 0.1 Scatter plot showing cuts in a

reference text

[1] N. Alon and V.D. Milman. λ1 , isoperimetric inequalities for graphs, and superconcentrators. Journal of Combinatorial Theory, Series B, 38:73–88, 1985. 100000 CK PA run time (seconds) 10000 lem 1000 rob np g lvin 100 e Eig h wit AR P ing SD SD lv So So s eti 10 hM it hw p 1 ng ti ec Bis 0.1 0.01 100 LR P- h wit 1000 gra r i_p hh low wit idF gM lvin So 10000 100000 1e+06 graph size (nodes + edges) 1e+07 Figure 6: Run time scaling on subsets of the Yahoo IM graph. Finding Spectral and SDP embeddings with ARPACK and SDP-LR requires about the same amount of time, while MidFlow rounding with hi pr is about 1000 times faster.

[2] Samuel Burer and Renato D.C. Monteiro. A nonlinear programming algorithm for solving semideﬁnite programs via low-rank factorization. Mathematical Programming (series B), 95(2):329–357, 2003.

[3] Samuel Burer and Renato D.C. Monteiro. Local minima and convergence in low-rank semideﬁnite programming. Technical report, Department of Management Sciences, University of Iowa, September 2003.

[4] Boris V. Cherkassky and Andrew V. Goldberg. On implementing the push-relabel method for the maximum ﬂow problem. Algorithmica, 19(4):390–410, 1997.

[5] F. Chung and L. Lu. Average distances in random graphs with given expected degree sequences. Proceedings of National Academy of Science, 99:15879–15882, 2002.

[6] W.E. Donath and A. J. Hoffman. Lower bounds for partitioning of graphs. IBM J. Res. Develop., 17:420–425, 1973.

[7] Peter G. Doyle and J. Laurie Snell. Random walks and electric networks, 1984. Mathematical Association of America; now available under the GPL.

[8] C.M. Fiduccia and R.M. Mattheyses. A linear time heuristic for improving network partitions. In Design Automation Conference, pages 175–181, 1982.

[9] Michel X. Goemans and David P. Williamson. Improved Approximation Algorithms for Maximum Cut and Satisﬁability Problems Using Semideﬁnite Programming. J. Assoc. Comput. Mach., 42:1115–1145, 1995.

[10] Stephen Guattery and Gary L. Miller. On the quality of spectral separators. SIAM Journal on Matrix Analysis and Applications, 19(3):701–719, 1998.

[11] C. Helmberg. Numerical evaluation of sbmethod. Math. Programming, 95(2):381–406, 2003.

[12] Bruce Hendrickson and Robert W. Leland. A multi-level algorithm for partitioning graphs. In Supercomputing, 1995.

[13] Kevin Lang and Satish Rao. A ﬂow-based method for improving the expansion or conductance of graph cuts. In Integer Programming and Combinatorial Optimization, pages 325–337, 2003.

[14] Umesh V. Vazirani Sanjeev Arora, Satish Rao. Expander ﬂows, geometric embeddings and graph partitioning. In STOC, pages 222–231, 2004.

[15] Jianbo Shi and Jitendra Malik. Normalized cuts and image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(8):888–905, 2000.

[16] Daniel A. Spielman and Shang-Hua Teng. Spectral partitioning works: Planar graphs and ﬁnite element meshes. In FOCS, pages 96–105, 1996.