jmlr jmlr2006 jmlr2006-69 knowledge-graph by maker-knowledge-mining

69 jmlr-2006-One-Class Novelty Detection for Seizure Analysis from Intracranial EEG


Source: pdf

Author: Andrew B. Gardner, Abba M. Krieger, George Vachtsevanos, Brian Litt

Abstract: This paper describes an application of one-class support vector machine (SVM) novelty detection for detecting seizures in humans. Our technique maps intracranial electroencephalogram (EEG) time series into corresponding novelty sequences by classifying short-time, energy-based statistics computed from one-second windows of data. We train a classifier on epochs of interictal (normal) EEG. During ictal (seizure) epochs of EEG, seizure activity induces distributional changes in feature space that increase the empirical outlier fraction. A hypothesis test determines when the parameter change differs significantly from its nominal value, signaling a seizure detection event. Outputs are gated in a “one-shot” manner using persistence to reduce the false alarm rate of the system. The detector was validated using leave-one-out cross-validation (LOO-CV) on a sample of 41 interictal and 29 ictal epochs, and achieved 97.1% sensitivity, a mean detection latency of -7.58 seconds, and an asymptotic false positive rate (FPR) of 1.56 false positives per hour (Fp/hr). These results are better than those obtained from a novelty detection technique based on Mahalanobis distance outlier detection, and comparable to the performance of a supervised learning technique used in experimental implantable devices (Echauz et al., 2001). The novelty detection paradigm overcomes three significant limitations of competing methods: the need to collect seizure data, precisely mark seizure onset and offset times, and perform patient-specific parameter tuning for detector training. Keywords: seizure detection, novelty detection, one-class SVM, epilepsy, unsupervised learning. ©2005 Andrew B. Gardner, Abba M. Krieger, George Vachtsevanos and Brian Litt.

Reference: text


Summary: the most important sentences generated by the tfidf model

sentIndex sentText sentNum sentScore

1 We train a classifier on epochs of interictal (normal) EEG. [sent-13, score-0.349]

2 During ictal (seizure) epochs of EEG, seizure activity induces distributional changes in feature space that increase the empirical outlier fraction. [sent-14, score-1.128]

3 A hypothesis test determines when the parameter change differs significantly from its nominal value, signaling a seizure detection event. [sent-15, score-0.903]

4 The detector was validated using leave-one-out cross-validation (LOO-CV) on a sample of 41 interictal and 29 ictal epochs, and achieved 97. [sent-17, score-0.687]

5 1% sensitivity, a mean detection latency of -7.58 seconds, and an asymptotic false positive rate (FPR) of 1.56 false positives per hour (Fp/hr). [sent-18, score-0.326]

6 These results are better than those obtained from a novelty detection technique based on Mahalanobis distance outlier detection, and comparable to the performance of a supervised learning technique used in experimental implantable devices (Echauz et al. [sent-23, score-0.503]

7 The novelty detection paradigm overcomes three significant limitations of competing methods: the need to collect seizure data, precisely mark seizure onset and offset times, and perform patient-specific parameter tuning for detector training. [sent-25, score-2.079]

8 Keywords: seizure detection, novelty detection, one-class SVM, epilepsy, unsupervised learning. 1 Introduction: Epilepsy, a neurological disorder in which patients suffer from recurring seizures, affects approximately 1% of the world population. [sent-26, score-0.959]

9 In spite of available dietary, drug, and surgical treatment options, more than 25% of individuals with epilepsy have seizures that are uncontrollable (Kandel, Schwartz & Jessel, 1991). [sent-30, score-0.292]

10 These treatments rely on robust algorithms for seizure detection to perform effectively. [sent-34, score-0.883]

11 Over the past 30 years seizure detection technology has matured. [sent-35, score-0.883]

12 This paper presents one technique for improving the state of the art in seizure detection by reformulating the task as a time-series novelty detection problem. [sent-39, score-1.283]

13 While seizure detection is traditionally considered a supervised learning problem (e. [sent-40, score-0.883]

14 , binary classification), an unsupervised approach allows for uniform treatment of seizure detection and prediction, and offers four key advantages for implementation. [sent-42, score-0.883]

15 Third, there is no need to collect seizure data for training. [sent-45, score-0.696]

16 Finally, there is no need to precisely mark seizure intervals. [sent-49, score-0.677]

17 For “properly chosen” features, novelties correspond well with the ictal events of interest, and our EEG time series are successfully segmented in a one-class-from-many manner. [sent-53, score-0.295]

18 2 Background: In this section, we present a brief review of seizure-related terminology, the seizure detection literature, and the one-class SVM. [sent-54, score-0.883]

19 2.1 Seizure-Related Terminology: Seizure analysis refers collectively to algorithms for seizure detection, seizure prediction, and automatic focus channel identification. [sent-56, score-1.37]

20 When multiple channels are considered, the electrode location that exhibits the earliest evidence of seizure activity is labeled the focus channel. [sent-59, score-0.729]

21 It is convenient to describe segments of the EEG signals by their temporal proximity to seizure activity. [sent-60, score-0.701]

22 The ictal period refers to the time during which a seizure occurs. [sent-61, score-0.947]

23 The interictal period is the time between successive seizures. [sent-62, score-0.262]

24 2.2 Seizure Detection: Early attempts to detect seizures began in the 1970s (Viglione, Ordon & Risch, 1970; Liss, 1973) and primarily considered scalp EEG recordings to detect the clinical (and, less frequently, the electrographic) onset of seizures. [sent-67, score-0.411]

25 In 1990, Gotman reported a technique for automated seizure detection that achieved 76% detection accuracy at 1 Fp/hr for 293 seizures recorded from 49 patients (Gotman, 1990). [sent-68, score-1.377]

26 Their detector achieved 100% detection accuracy on an 11-seizure database. [sent-70, score-0.393]

27 In 1995, Qu and Gotman presented an early seizure warning system trained on template EEG activity that achieved 100% detection accuracy at a mean detection latency of 9. [sent-71, score-1.258]

28 claimed 100% detection sensitivity with a mean detection latency of 2. [sent-76, score-0.561]

29 Several successful attempts at seizure detection using artificial neural network classifiers have been reported since 1996 (Khorasani & Weng, 1996; Webber et al. [sent-83, score-0.898]

30 Evaluation of 31 distinct features (Esteller, 2000) showed that fractal dimension, wavelet packet energy, and mean Teager energy were especially promising for seizure detection. [sent-85, score-0.716]
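The energy-based, per-window statistics discussed here can be sketched in a few lines. The definitions below follow the standard literature forms of the E, TE, and CL features that appear later in Figure 7 (mean energy, mean Teager energy, line/curve length); the window length and normalizations are illustrative assumptions, not necessarily the paper's exact choices.

```python
import numpy as np

def window_features(x):
    """Energy-based statistics for one window of EEG samples.  The
    feature forms (mean energy, mean Teager energy, line length) follow
    their common literature definitions; the paper's exact
    normalizations may differ."""
    energy = np.mean(x ** 2)                          # mean signal energy (E)
    teager = np.mean(x[1:-1] ** 2 - x[:-2] * x[2:])   # mean Teager energy (TE)
    line_length = np.mean(np.abs(np.diff(x)))         # line length / curve length (CL)
    return np.array([energy, teager, line_length])

# One second of synthetic data at an assumed 200 Hz sampling rate
rng = np.random.default_rng(0)
feats = window_features(rng.standard_normal(200))
```

In the paper's pipeline, one such feature vector per one-second window forms the input to the one-class SVM.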

31 In 2001, Esteller reported a detector based on the line length feature that achieved a mean detection latency of 4. [sent-86, score-0.536]

32 , subsequently reported a similar detector based upon this work that achieved 97% sensitivity at a mean detection latency of 5. [sent-92, score-0.542]

33 The NeuroPace detector claims represent the state of the art in seizure detection performance. [sent-96, score-1.07]

34 More complete reviews of the seizure detection and prediction literature are available elsewhere (Litt & Echauz, 2002; Gardner, 2004). [sent-97, score-0.883]

35 3 Methodology: In this section, we describe and discuss the experimental methods for detecting seizures under a novelty detection framework. [sent-140, score-0.6]

36 Figure 2: The seizure analysis architecture: IEEG is preprocessed, features are extracted, novelty detection and parameter estimation are applied, and persistence gates the output z[n]; model parameters are Π = {ν, γ, p, N, T}. [sent-142, score-1.165]

37 3.1 Human Data Preparation: The data analyzed were selected from intracranial EEG recordings of epilepsy patients implanted as part of standard evaluation for epilepsy surgery. [sent-145, score-0.349]

38 Five consecutive patients with seizures arising from the temporal lobe(s) were selected for review, and the corresponding data were expertly and independently marked by two certified epileptologists to indicate UEO and UCO times. [sent-150, score-0.312]

39 Ictal epochs were selected from the focus channel for each temporal lobe seizure that a patient exhibited. [sent-153, score-0.865]

40 Two patients exhibited some seizures with extra-temporal focal regions: those events were excluded from further analysis. [sent-154, score-0.306]

41 3.2 Feature Extraction: Many features have been proposed for seizure analysis (Esteller, 2000; D’Alessandro, 2001; Esteller et al. [sent-160, score-0.677]

42 3.3 One-Class SVM: Feature extraction was performed on interictal epochs to generate feature vectors for training. [sent-171, score-0.334]

43 ν = 0.1 was chosen, consistent with the estimated fraction of ictal data. [sent-179, score-0.296]

44 3.4 Parameter Estimation: For a stationary process, the one-class SVM novelty parameter, ν, asymptotically equals the outlier fraction. [sent-182, score-0.274]

45 We exploit this property by training on features which strongly discriminate interictal from ictal EEG: features are stationary during interictal periods, but change markedly during periods of seizure activity, causing significant changes in the empirical outlier fraction. [sent-183, score-1.55]
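This property can be sketched with scikit-learn's OneClassSVM on synthetic 3-d features. The Gaussian data and the mean shift below are stand-ins for interictal and ictal feature distributions; ν = 0.1 matches the paper's choice, while the kernel settings and sample sizes are illustrative assumptions.

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(1)
train = rng.standard_normal((1000, 3))   # stand-in "interictal" feature vectors

# nu = 0.1 matches the paper's choice; the kernel settings are assumptions
clf = OneClassSVM(kernel="rbf", nu=0.1, gamma="scale").fit(train)

# On held-out data from the training distribution, the empirical outlier
# fraction sits near nu ...
frac_normal = float(np.mean(clf.predict(rng.standard_normal((1000, 3))) == -1))

# ... while a distributional shift (an "ictal"-like change) inflates it.
frac_shifted = float(np.mean(clf.predict(rng.standard_normal((1000, 3)) + 3.0) == -1))
```

The detection logic then amounts to testing whether the observed outlier fraction has departed from its nominal value ν.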

46 We assumed that ν = ν0 for interictal EEG, and ν = ν1 > ν0 for ictal EEG. [sent-185, score-0.5]

47 We then used this estimate to compute a seizure event indicator variable, z[k] = sgn(ν̂ − C) (8), where z = +1 if a seizure is indicated, z = −1 otherwise, and C ∈ [0, 1] is a threshold parameter. [sent-188, score-1.373]
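Equation (8) can be sketched as a trailing-window estimate of the outlier fraction followed by a threshold. The window length and C below are illustrative assumptions, and sgn(0) is mapped to −1 (no detection at exactly the threshold).

```python
import numpy as np

def seizure_indicator(novelty_seq, window, C):
    """Sketch of equation (8): z[k] = sgn(nu_hat - C), where nu_hat is
    the empirical outlier fraction over a trailing window of per-frame
    detector outputs (-1 = outlier/novel, +1 = normal).  Window length
    and threshold C are illustrative, not the paper's values."""
    nov = (np.asarray(novelty_seq) == -1).astype(float)
    z = []
    for k in range(window, len(nov) + 1):
        nu_hat = nov[k - window:k].mean()   # empirical outlier fraction
        z.append(1 if nu_hat > C else -1)   # sgn(0) mapped to -1 here
    return np.array(z)

# 50 normal frames followed by 50 novel frames: the indicator flips to +1
z = seizure_indicator([1] * 50 + [-1] * 50, window=20, C=0.5)
```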

48 3.5 Persistence (Detector Refractory Period): During early experiments we observed that the detector tended to generate novelty events (i. [sent-204, score-0.399]

49 , “fire”) in bursts, with increasing frequency near seizure onset. [sent-206, score-0.677]

50 This behavior may indicate the presence of preictal states, periods of EEG activity that are likely to transition from interictal to ictal state. [sent-207, score-0.555]

51 Persistence offers an improvement to the basic system beyond false positive rate improvement: it allows for the characterization of the detector over a range of detection time horizons. [sent-213, score-0.466]

52 As persistence decreases, one expects the false positive rate to increase and the detection latency to approach zero seconds. [sent-214, score-0.448]

53 Conversely, as persistence increases, one expects the false positive rate to decrease, asymptotically approaching a value determined jointly by the novelty parameters of the system (some fraction of the data will always be novel) and the actual novelty rate due to epileptiform activity. [sent-215, score-0.568]

54 We heuristically set the detector persistence to TR = 180 seconds for our experiments. [sent-217, score-0.285]
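One-shot persistence gating can be sketched as a simple refractory rule. TR = 180 s is the paper's heuristic; the function and argument names below are hypothetical.

```python
def gate_with_persistence(detections, times, refractory=180.0):
    """One-shot gating: after an alarm, further raw detections are
    suppressed for `refractory` seconds (TR = 180 s in the paper).
    `detections` holds 0/1 raw detector decisions at the given times;
    the names here are hypothetical, not the paper's."""
    alarms, last = [], None
    for t, d in zip(times, detections):
        if d == 1 and (last is None or t - last >= refractory):
            alarms.append(t)    # fire, then enter the refractory period
            last = t
    return alarms

# Raw detections at t = 10, 50, and 200 s: the one at 50 s is suppressed
alarms = gate_with_persistence([1, 1, 1, 0], [10.0, 50.0, 200.0, 400.0])
```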

55 Figure 3: Examples of persistence (refractory period TR) for improving detector false alarm performance. [sent-218, score-0.344]

56 (Top) Ictal epoch showing seizure activity (red, diagonal hatching). [sent-219, score-0.734]

57 Training was performed using only interictal epochs; however, testing was performed on each ictal segment, in addition to the withheld interictal epoch. [sent-224, score-0.746]

58 This scheme yields C(NBL, 1) interictal- and C(NBL, 1) × NSZ ictal statistics per patient, where NBL and NSZ are the patient-specific numbers of interictal and ictal epochs, respectively. [sent-225, score-0.754]

59 From these statistics we estimate three key performance metrics: sensitivity, false positive rate, and mean detection latency. [sent-226, score-0.295]

60 A block true positive occurs when the detector output, after applying persistence, correctly identifies an interval containing a seizure onset (c. [sent-228, score-0.962]

61 Block false negatives occur when the detector labels an ictal interval as interictal, and block false positives occur when it labels an interictal interval as ictal. [sent-231, score-0.86]

62 Figure 4: Temporal relationships, cases (i)–(vi), considered in detector evaluation: intervals representing detected novelty (blue, vertical hatching) and ictal activity (red, diagonal hatching). [sent-232, score-0.685]

63 Mean detection latency measures detector responsiveness: µτ = (1/N) Σ_{i=1}^{N} τ_i (11), where τ_i is the detection latency of each detected seizure. [sent-242, score-0.828]

64 A negative latency indicates seizure event detection prior to the expert-labeled onset time. [sent-243, score-1.104]
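The three block-level metrics can be sketched together; the numbers in the example below are toy values, not the paper's results.

```python
import numpy as np

def performance_metrics(tp, fn, fp, hours, latencies):
    """Block-level metrics from the paper's evaluation scheme:
    sensitivity TP/(TP + FN), false positives per hour, and mean
    detection latency (11).  A negative latency means the detector
    fired before the expert-marked onset."""
    sensitivity = tp / (tp + fn)
    fpr = fp / hours                     # false positives per hour
    mu_tau = float(np.mean(latencies))   # mean detection latency, seconds
    return sensitivity, fpr, mu_tau

# Toy values, not the paper's results
s, fpr, mu = performance_metrics(tp=33, fn=1, fp=5, hours=3.2,
                                 latencies=[-7.0, 2.0, -1.0])
```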

65 3.7 Benchmark Novelty Detection: To provide a reference for the relative performance of our algorithm, and for the general application of unsupervised learning to the seizure detection problem, we implemented a simple benchmark novelty detection algorithm. [sent-245, score-1.306]
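A minimal sketch of such a Mahalanobis-distance benchmark follows. The threshold here (≈ the 0.90 chi-square quantile with 3 degrees of freedom) is an assumption chosen to mimic a 90% inlier region like the one shown later in Figure 8; the paper's exact calibration may differ.

```python
import numpy as np

def mahalanobis_detector(train, threshold):
    """Benchmark novelty detector: flag a feature vector as novel (-1)
    when its squared Mahalanobis distance from the training mean
    exceeds `threshold`; otherwise normal (+1)."""
    mu = train.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(train, rowvar=False))
    def predict(x):
        d = np.atleast_2d(x) - mu
        d2 = np.einsum("ij,jk,ik->i", d, cov_inv, d)   # squared distances
        return np.where(d2 > threshold, -1, 1)
    return predict

# 6.25 ~ the 0.90 chi-square quantile with 3 dof: a ~90% inlier region
rng = np.random.default_rng(2)
predict = mahalanobis_detector(rng.standard_normal((2000, 3)), 6.25)
```

Unlike the one-class SVM, this detector assumes a single elliptical inlier region, which is one plausible reason for its weaker performance on multimodal feature distributions.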

66 4 Results: In this section we present the results of both seizure detection approaches. [sent-259, score-0.883]

67 (a) All false positives occurred on a single ictal epoch. [sent-317, score-0.354]

68 Several seizure onsets were originally mislabeled by as much as 110 seconds. [sent-318, score-0.677]

69 The bottom row of the table summarizes aggregate results. We estimate the FPR over interictal EEG from the data in Table 1 by dividing FPBL by the epoch duration (0. [sent-323, score-0.274]

70 Since ictal events are rare, and the aggregate false positive rate on ictal segments is lower than the corresponding rate on interictal segments, we take the interictal FPR as an asymptotic measure of the overall FPR. [sent-327, score-1.091]

71 We reviewed the results for those patients (2, 4, and 5) with negative mean detection latencies. [sent-328, score-0.31]

72 For each of these patients we found that the distribution of detection latencies was skewed, and a fraction (less than one-third) of the models detected seizures early. [sent-329, score-0.577]

73 Representative IEEG time series, novelty sequences, and estimated outlier fractions for interictal- and ictal epochs are shown in Figures 5 and 6. [sent-337, score-0.593]

74 As expected, the outlier fraction remained near its (small) nominal value except during periods of seizure activity. [sent-338, score-0.845]

75 Onsets were detected quickly, and the entire seizure event—not just the onset— was correctly identified as novel. [sent-339, score-0.698]

76 The near-zero false negative rate (FNR) of the detector was surprising because the data used for training originated from unknown states of consciousness (e. [sent-340, score-0.26]

77 Typically, seizure detection performance is drastically affected by patient state-of-consciousness; evaluation on larger data sets with concomitant sleep staging information will provide a better estimate of the true FNR. [sent-343, score-0.943]

78 (Top) IEEG signal, (Middle) frame-wise output of the novelty detector, z , (Bottom) estimated outlier fraction (dashed line is 0. [sent-345, score-0.316]

79 The earliest electrographic change is visible as the beginning of the pinched region prior to the high-amplitude seizure onset. [sent-351, score-0.723]

80 The UEO occurs at time zero, (Middle) frame-wise output of the novelty detector, z , (Bottom) estimated outlier fraction and 0. [sent-352, score-0.316]

81 The detector has a latency of about 3 seconds in this example. [sent-354, score-0.324]

82 The SVM detector’s mean detection latency outperformed all previously reported seizure detection algorithms. [sent-355, score-1.209]

83 It should be noted, however, that this result is attributable to the large fraction of seizures (27%) that were detected early. [sent-356, score-0.263]

84 While they did not perform cross-validation, and optimized in-sample for each patient, their reported results—a mean detection latency of 5. [sent-362, score-0.326]

85 Both techniques are surprisingly effective at seizure detection, but the one-class SVM method performed consistently better, especially with respect to false positive rate. [sent-369, score-0.75]

86 Figure 7 shows the marginal distributions of features for both interictal and ictal data. [sent-371, score-0.5]

87 Figure 7: Representative marginals of the feature vector—E (solid blue), TE (dashed red), CL (dotted green)—for patient 5, corresponding to interictal (top) and ictal (bottom) frames. [sent-380, score-0.606]

88 1039 GARDNER, KRIEGER, VACHTSEVANOS AND LITT Figure 8: Representative isosurfaces in interictal feature space produced by each method. [sent-381, score-0.269]

89 Both approaches, SVM and Mahalanobis, find regions, S1 and S2 , in feature space that include 90% of the observations from interictal data. [sent-385, score-0.269]

90 2 Detector Output Analysis We analyzed a sample of 850 interictal detector outputs and confirmed that the empirical outlier fraction equaled its nominal value, 0. [sent-392, score-0.595]

91 5 for the probabilities of a novelty occurrence during ictal epochs. [sent-397, score-0.448]

92 Empirically, the conditional probability of a novel detector output given a previous novelty output increases dramatically from 0. [sent-405, score-0.381]

93 This analysis suggests that the detector output sequence obeys a Markov process where the probability at each point in time of a novelty is P(zt = −1) = ν0 = 0. [sent-408, score-0.4]
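The bursting behavior behind this Markov reading can be quantified by the empirical conditional probability of a novelty given a preceding novelty. The helper below is a sketch, not the paper's analysis code.

```python
import numpy as np

def conditional_novelty_prob(z):
    """Empirical P(z[t] = -1 | z[t-1] = -1) from a sequence of detector
    outputs (+1 normal, -1 novel).  A value well above the marginal
    P(z = -1) reflects the bursting behavior consistent with a Markov
    model of the output sequence."""
    z = np.asarray(z)
    prev_novel = z[:-1] == -1
    if prev_novel.sum() == 0:
        return 0.0
    return float(np.mean(z[1:][prev_novel] == -1))

# Toy sequence in which novelties arrive in bursts
p = conditional_novelty_prob([1, 1, -1, -1, -1, 1, 1, -1, -1, 1])
```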

94 5 Conclusions: Traditional approaches to seizure detection rely on binary classification. [sent-434, score-0.883]

95 They require seizure data for training, which is difficult and invasive to collect, and do not address the class imbalance problem between interictal and ictal EEG, as less than 1% of EEG data from epileptic patients is seizure-related. [sent-435, score-1.296]

96 These approaches assume that seizures develop in a consistent manner and seek to identify features and architectures that discriminate seizure EEG from “other” EEG. [sent-436, score-0.877]

97 In contrast, we have presented a technique for seizure detection based on novelty detection that operates by modeling the dominant data class, interictal EEG, and declaring outliers to this class as seizure events. [sent-437, score-2.206]

98 The success of our method relies on detecting change points in the empirical outlier fraction with respect to a feature space that strongly discriminates interictal from ictal EEG. [sent-438, score-0.645]

99 While the false positive performance of the detector is not as good as that of other reported algorithms, this may be attributable to the presence of subclinical seizures, or other nonictal anomalies in the data (e. [sent-441, score-0.26]

100 In this setting, the need to prevent seizures (avoid false negative events), and the apparent relative harmlessness of false positive stimulations, encourage making the detector hypersensitive. [sent-447, score-0.533]


similar papers computed by tfidf model

tfidf for this paper:

wordName wordTfidf (topN-words)

[('seizure', 0.677), ('ictal', 0.254), ('interictal', 0.246), ('detection', 0.206), ('seizures', 0.2), ('novelty', 0.194), ('detector', 0.187), ('eeg', 0.17), ('ieeg', 0.161), ('gardner', 0.123), ('litt', 0.123), ('latency', 0.104), ('krieger', 0.1), ('onset', 0.098), ('epilepsy', 0.092), ('vachtsevanos', 0.092), ('patients', 0.088), ('outlier', 0.08), ('esteller', 0.077), ('false', 0.073), ('epochs', 0.065), ('persistence', 0.065), ('gotman', 0.062), ('patient', 0.06), ('echauz', 0.054), ('intracranial', 0.054), ('svm', 0.052), ('fraction', 0.042), ('classification', 0.04), ('fpr', 0.039), ('bl', 0.038), ('neurology', 0.038), ('osorio', 0.038), ('ueo', 0.038), ('classifier', 0.038), ('clinical', 0.035), ('seconds', 0.033), ('qu', 0.033), ('mahalanobis', 0.033), ('atlanta', 0.031), ('epileptic', 0.031), ('fpbl', 0.031), ('refractory', 0.031), ('sz', 0.031), ('sensitivity', 0.029), ('activity', 0.029), ('detections', 0.029), ('epoch', 0.028), ('positives', 0.027), ('periods', 0.026), ('temporal', 0.024), ('earliest', 0.023), ('georgia', 0.023), ('feature', 0.023), ('electrodes', 0.023), ('electroencephalography', 0.023), ('electrographic', 0.023), ('epilepsia', 0.023), ('fpsz', 0.023), ('frei', 0.023), ('hatching', 0.023), ('implantable', 0.023), ('lobe', 0.023), ('mfp', 0.023), ('neuropace', 0.023), ('novelties', 0.023), ('recordings', 0.023), ('stimulation', 0.023), ('teager', 0.023), ('uco', 0.023), ('electrical', 0.023), ('benchmark', 0.023), ('energy', 0.023), ('significant', 0.021), ('detected', 0.021), ('nominal', 0.02), ('iid', 0.02), ('outputs', 0.02), ('warning', 0.02), ('efficient', 0.02), ('therapy', 0.02), ('dissertation', 0.02), ('te', 0.02), ('latencies', 0.02), ('zt', 0.019), ('alarm', 0.019), ('collect', 0.019), ('event', 0.019), ('producing', 0.018), ('events', 0.018), ('tp', 0.018), ('detect', 0.016), ('mean', 0.016), ('channel', 0.016), ('period', 0.016), ('five', 0.016), ('device', 0.016), ('fn', 0.016), ('abba', 0.015), ('artificial', 0.015)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 1.0000008 69 jmlr-2006-One-Class Novelty Detection for Seizure Analysis from Intracranial EEG

Author: Andrew B. Gardner, Abba M. Krieger, George Vachtsevanos, Brian Litt

Abstract: This paper describes an application of one-class support vector machine (SVM) novelty detection for detecting seizures in humans. Our technique maps intracranial electroencephalogram (EEG) time series into corresponding novelty sequences by classifying short-time, energy-based statistics computed from one-second windows of data. We train a classifier on epochs of interictal (normal) EEG. During ictal (seizure) epochs of EEG, seizure activity induces distributional changes in feature space that increase the empirical outlier fraction. A hypothesis test determines when the parameter change differs significantly from its nominal value, signaling a seizure detection event. Outputs are gated in a “one-shot” manner using persistence to reduce the false alarm rate of the system. The detector was validated using leave-one-out cross-validation (LOO-CV) on a sample of 41 interictal and 29 ictal epochs, and achieved 97.1% sensitivity, a mean detection latency of -7.58 seconds, and an asymptotic false positive rate (FPR) of 1.56 false positives per hour (Fp/hr). These results are better than those obtained from a novelty detection technique based on Mahalanobis distance outlier detection, and comparable to the performance of a supervised learning technique used in experimental implantable devices (Echauz et al., 2001). The novelty detection paradigm overcomes three significant limitations of competing methods: the need to collect seizure data, precisely mark seizure onset and offset times, and perform patient-specific parameter tuning for detector training. Keywords: seizure detection, novelty detection, one-class SVM, epilepsy, unsupervised learning.

2 0.074632525 3 jmlr-2006-A Hierarchy of Support Vector Machines for Pattern Detection

Author: Hichem Sahbi, Donald Geman

Abstract: We introduce a computational design for pattern detection based on a tree-structured network of support vector machines (SVMs). An SVM is associated with each cell in a recursive partitioning of the space of patterns (hypotheses) into increasingly finer subsets. The hierarchy is traversed coarse-to-fine and each chain of positive responses from the root to a leaf constitutes a detection. Our objective is to design and build a network which balances overall error and computation. Initially, SVMs are constructed for each cell with no constraints. This “free network” is then perturbed, cell by cell, into another network, which is “graded” in two ways: first, the number of support vectors of each SVM is reduced (by clustering) in order to adjust to a pre-determined, increasing function of cell depth; second, the decision boundaries are shifted to preserve all positive responses from the original set of training data. The limits on the numbers of clusters (virtual support vectors) result from minimizing the mean computational cost of collecting all detections subject to a bound on the expected number of false positives. When applied to detecting faces in cluttered scenes, the patterns correspond to poses and the free network is already faster and more accurate than applying a single pose-specific SVM many times. The graded network promotes very rapid processing of background regions while maintaining the discriminatory power of the free network. Keywords: statistical learning, hierarchy of classifiers, coarse-to-fine computation, support vector machines, face detection

3 0.054993711 97 jmlr-2006- (Special Topic on Machine Learning for Computer Security)

Author: Charles V. Wright, Fabian Monrose, Gerald M. Masson

Abstract: Several fundamental security mechanisms for restricting access to network resources rely on the ability of a reference monitor to inspect the contents of traffic as it traverses the network. However, with the increasing popularity of cryptographic protocols, the traditional means of inspecting packet contents to enforce security policies is no longer a viable approach as message contents are concealed by encryption. In this paper, we investigate the extent to which common application protocols can be identified using only the features that remain intact after encryption—namely packet size, timing, and direction. We first present what we believe to be the first exploratory look at protocol identification in encrypted tunnels which carry traffic from many TCP connections simultaneously, using only post-encryption observable features. We then explore the problem of protocol identification in individual encrypted TCP connections, using much less data than in other recent approaches. The results of our evaluation show that our classifiers achieve accuracy greater than 90% for several protocols in aggregate traffic, and, for most protocols, greater than 80% when making fine-grained classifications on single connections. Moreover, perhaps most surprisingly, we show that one can even estimate the number of live connections in certain classes of encrypted tunnels to within, on average, better than 20%. Keywords: traffic classification, hidden Markov models, network security

4 0.047386181 38 jmlr-2006-Incremental Support Vector Learning: Analysis, Implementation and Applications     (Special Topic on Machine Learning and Optimization)

Author: Pavel Laskov, Christian Gehl, Stefan Krüger, Klaus-Robert Müller

Abstract: Incremental Support Vector Machines (SVM) are instrumental in practical applications of online learning. This work focuses on the design and analysis of efficient incremental SVM learning, with the aim of providing a fast, numerically stable and robust implementation. A detailed analysis of convergence and of algorithmic complexity of incremental SVM learning is carried out. Based on this analysis, a new design of storage and numerical operations is proposed, which speeds up the training of an incremental SVM by a factor of 5 to 20. The performance of the new algorithm is demonstrated in two scenarios: learning with limited resources and active learning. Various applications of the algorithm, such as in drug discovery, online monitoring of industrial devices and and surveillance of network traffic, can be foreseen. Keywords: incremental SVM, online learning, drug discovery, intrusion detection

5 0.024029331 59 jmlr-2006-Machine Learning for Computer Security    (Special Topic on Machine Learning for Computer Security)

Author: Philip K. Chan, Richard P. Lippmann

Abstract: The prevalent use of computers and internet has enhanced the quality of life for many people, but it has also attracted undesired attempts to undermine these systems. This special topic contains several research studies on how machine learning algorithms can help improve the security of computer systems. Keywords: computer security, spam, images with embedded text, malicious executables, network protocols, encrypted traffic

6 0.0232797 86 jmlr-2006-Step Size Adaptation in Reproducing Kernel Hilbert Space

7 0.019549653 91 jmlr-2006-The Interplay of Optimization and Machine Learning Research     (Special Topic on Machine Learning and Optimization)

8 0.018033454 79 jmlr-2006-Second Order Cone Programming Approaches for Handling Missing and Uncertain Data     (Special Topic on Machine Learning and Optimization)

9 0.017349068 18 jmlr-2006-Building Support Vector Machines with Reduced Classifier Complexity     (Special Topic on Machine Learning and Optimization)

10 0.016389085 88 jmlr-2006-Streamwise Feature Selection

11 0.016165838 15 jmlr-2006-Bayesian Network Learning with Parameter Constraints     (Special Topic on Machine Learning and Optimization)

12 0.015923049 60 jmlr-2006-Manifold Regularization: A Geometric Framework for Learning from Labeled and Unlabeled Examples

13 0.0158868 52 jmlr-2006-Learning Spectral Clustering, With Application To Speech Separation

14 0.015639005 57 jmlr-2006-Linear State-Space Models for Blind Source Separation

15 0.014504707 12 jmlr-2006-Active Learning with Feedback on Features and Instances

16 0.014484707 13 jmlr-2006-Adaptive Prototype Learning Algorithms: Theoretical and Experimental Studies

17 0.014167499 65 jmlr-2006-Nonparametric Quantile Estimation

18 0.013942054 37 jmlr-2006-Incremental Algorithms for Hierarchical Classification

19 0.013673181 43 jmlr-2006-Large Scale Multiple Kernel Learning     (Special Topic on Machine Learning and Optimization)

20 0.013242706 17 jmlr-2006-Bounds for the Loss in Probability of Correct Classification Under Model Based Approximation


similar papers computed by lsi model

lsi for this paper:

topicId topicWeight

[(0, 0.081), (1, -0.023), (2, 0.027), (3, 0.059), (4, -0.001), (5, -0.008), (6, -0.105), (7, -0.174), (8, 0.012), (9, -0.006), (10, -0.102), (11, -0.056), (12, 0.032), (13, -0.039), (14, 0.047), (15, -0.056), (16, 0.053), (17, 0.007), (18, -0.005), (19, 0.12), (20, -0.017), (21, -0.084), (22, -0.025), (23, 0.043), (24, 0.204), (25, 0.049), (26, -0.047), (27, 0.061), (28, 0.235), (29, 0.081), (30, -0.055), (31, 0.158), (32, 0.406), (33, 0.099), (34, -0.015), (35, 0.12), (36, 0.122), (37, -0.117), (38, 0.041), (39, 0.142), (40, -0.112), (41, -0.371), (42, 0.115), (43, 0.083), (44, -0.005), (45, 0.206), (46, -0.041), (47, 0.014), (48, 0.012), (49, -0.227)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.97670883 69 jmlr-2006-One-Class Novelty Detection for Seizure Analysis from Intracranial EEG

Author: Andrew B. Gardner, Abba M. Krieger, George Vachtsevanos, Brian Litt

Abstract: This paper describes an application of one-class support vector machine (SVM) novelty detection for detecting seizures in humans. Our technique maps intracranial electroencephalogram (EEG) time series into corresponding novelty sequences by classifying short-time, energy-based statistics computed from one-second windows of data. We train a classifier on epochs of interictal (normal) EEG. During ictal (seizure) epochs of EEG, seizure activity induces distributional changes in feature space that increase the empirical outlier fraction. A hypothesis test determines when the parameter change differs significantly from its nominal value, signaling a seizure detection event. Outputs are gated in a “one-shot” manner using persistence to reduce the false alarm rate of the system. The detector was validated using leave-one-out cross-validation (LOO-CV) on a sample of 41 interictal and 29 ictal epochs, and achieved 97.1% sensitivity, a mean detection latency of ©2005 Andrew B. Gardner, Abba M. Krieger, George Vachtsevanos and Brian Litt GARDNER, KRIEGER, VACHTSEVANOS AND LITT -7.58 seconds, and an asymptotic false positive rate (FPR) of 1.56 false positives per hour (Fp/hr). These results are better than those obtained from a novelty detection technique based on Mahalanobis distance outlier detection, and comparable to the performance of a supervised learning technique used in experimental implantable devices (Echauz et al., 2001). The novelty detection paradigm overcomes three significant limitations of competing methods: the need to collect seizure data, precisely mark seizure onset and offset times, and perform patient-specific parameter tuning for detector training. Keywords: seizure detection, novelty detection, one-class SVM, epilepsy, unsupervised learning 1

2 0.4276481 3 jmlr-2006-A Hierarchy of Support Vector Machines for Pattern Detection

Author: Hichem Sahbi, Donald Geman

Abstract: We introduce a computational design for pattern detection based on a tree-structured network of support vector machines (SVMs). An SVM is associated with each cell in a recursive partitioning of the space of patterns (hypotheses) into increasingly finer subsets. The hierarchy is traversed coarse-to-fine and each chain of positive responses from the root to a leaf constitutes a detection. Our objective is to design and build a network which balances overall error and computation. Initially, SVMs are constructed for each cell with no constraints. This “free network” is then perturbed, cell by cell, into another network, which is “graded” in two ways: first, the number of support vectors of each SVM is reduced (by clustering) in order to adjust to a pre-determined, increasing function of cell depth; second, the decision boundaries are shifted to preserve all positive responses from the original set of training data. The limits on the numbers of clusters (virtual support vectors) result from minimizing the mean computational cost of collecting all detections subject to a bound on the expected number of false positives. When applied to detecting faces in cluttered scenes, the patterns correspond to poses and the free network is already faster and more accurate than applying a single pose-specific SVM many times. The graded network promotes very rapid processing of background regions while maintaining the discriminatory power of the free network. Keywords: statistical learning, hierarchy of classifiers, coarse-to-fine computation, support vector machines, face detection
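The coarse-to-fine traversal can be mimicked with a toy two-level cascade: a cheap root SVM separates all patterns from background, and finer per-"pose" SVMs are consulted only when the root fires. The 2-D Gaussian "poses" and all parameters below are invented for illustration; the paper builds a full recursive hierarchy with clustered (reduced-set) SVMs and shifted decision boundaries.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(1)
# Toy data (assumed, not the paper's face features): background near the
# origin, two "poses" A and B in separate regions of pattern space.
bg = rng.normal([0, 0], 0.4, size=(200, 2))
pose_a = rng.normal([3, 0], 0.4, size=(100, 2))
pose_b = rng.normal([0, 3], 0.4, size=(100, 2))

# Root cell: any pose vs background (coarse, cheap, evaluated first).
root = SVC(kernel="rbf", gamma=1.0).fit(
    np.vstack([bg, pose_a, pose_b]),
    np.r_[np.zeros(len(bg)), np.ones(len(pose_a) + len(pose_b))])

# Leaf cells: one SVM per pose, evaluated only on root positives.
leaves = {
    "A": SVC(kernel="rbf", gamma=1.0).fit(
        np.vstack([bg, pose_a]),
        np.r_[np.zeros(len(bg)), np.ones(len(pose_a))]),
    "B": SVC(kernel="rbf", gamma=1.0).fit(
        np.vstack([bg, pose_b]),
        np.r_[np.zeros(len(bg)), np.ones(len(pose_b))]),
}

def detect(x):
    """Coarse-to-fine: a detection is a root-to-leaf chain of positives."""
    if root.predict(x[None])[0] != 1:
        return []  # most background is rejected by one cheap test
    return [name for name, svm in leaves.items()
            if svm.predict(x[None])[0] == 1]

hits_a = detect(np.array([3.0, 0.0]))
hits_bg = detect(np.array([0.0, 0.0]))
```

Background points cost one SVM evaluation instead of one per pose, which is the source of the speed-up the abstract claims.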

3 0.23849991 38 jmlr-2006-Incremental Support Vector Learning: Analysis, Implementation and Applications     (Special Topic on Machine Learning and Optimization)

Author: Pavel Laskov, Christian Gehl, Stefan Krüger, Klaus-Robert Müller

Abstract: Incremental Support Vector Machines (SVM) are instrumental in practical applications of online learning. This work focuses on the design and analysis of efficient incremental SVM learning, with the aim of providing a fast, numerically stable and robust implementation. A detailed analysis of convergence and of algorithmic complexity of incremental SVM learning is carried out. Based on this analysis, a new design of storage and numerical operations is proposed, which speeds up the training of an incremental SVM by a factor of 5 to 20. The performance of the new algorithm is demonstrated in two scenarios: learning with limited resources and active learning. Various applications of the algorithm, such as in drug discovery, online monitoring of industrial devices and surveillance of network traffic, can be foreseen. Keywords: incremental SVM, online learning, drug discovery, intrusion detection
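Exact incremental SVM solvers update the solution while maintaining optimality conditions as each example arrives. As a loose stand-in for that online pattern, the sketch below streams mini-batches into scikit-learn's SGDClassifier with hinge loss (a linear SVM trained by stochastic gradient descent, not the paper's exact incremental algorithm); the data and batch sizes are invented.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(2)
# Hinge-loss SGD: a simple online approximation to incremental SVM training.
clf = SGDClassifier(loss="hinge", random_state=0)
classes = np.array([0, 1])

for _ in range(50):  # the stream arrives in small batches
    X = rng.normal(0, 1, size=(20, 2))
    y = (X[:, 0] + X[:, 1] > 0).astype(int)  # linearly separable concept
    clf.partial_fit(X, y, classes=classes)   # update, never retrain from scratch

X_test = rng.normal(0, 1, size=(500, 2))
y_test = (X_test[:, 0] + X_test[:, 1] > 0).astype(int)
acc = clf.score(X_test, y_test)
```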

4 0.20034996 97 jmlr-2006- (Special Topic on Machine Learning for Computer Security)

Author: Charles V. Wright, Fabian Monrose, Gerald M. Masson

Abstract: Several fundamental security mechanisms for restricting access to network resources rely on the ability of a reference monitor to inspect the contents of traffic as it traverses the network. However, with the increasing popularity of cryptographic protocols, the traditional means of inspecting packet contents to enforce security policies is no longer a viable approach as message contents are concealed by encryption. In this paper, we investigate the extent to which common application protocols can be identified using only the features that remain intact after encryption—namely packet size, timing, and direction. We first present what we believe to be the first exploratory look at protocol identification in encrypted tunnels which carry traffic from many TCP connections simultaneously, using only post-encryption observable features. We then explore the problem of protocol identification in individual encrypted TCP connections, using much less data than in other recent approaches. The results of our evaluation show that our classifiers achieve accuracy greater than 90% for several protocols in aggregate traffic, and, for most protocols, greater than 80% when making fine-grained classifications on single connections. Moreover, perhaps most surprisingly, we show that one can even estimate the number of live connections in certain classes of encrypted tunnels to within, on average, better than 20%. Keywords: traffic classification, hidden Markov models, network security
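A minimal version of this classification task: summarize each connection by the features that survive encryption (packet size, inter-arrival time, direction) and classify the summary vector. The two synthetic traffic shapes and the nearest-neighbour classifier below are assumptions for illustration; the paper works with hidden Markov models over real traces.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(3)

def summarize(pkts):
    """Per-connection summary of the post-encryption observables:
    packet size, inter-arrival gap, direction (columns 0, 1, 2)."""
    sizes, gaps, dirs = pkts[:, 0], pkts[:, 1], pkts[:, 2]
    return [sizes.mean(), sizes.std(), gaps.mean(), dirs.mean()]

def fake_conn(proto):
    # Hypothetical traffic shapes: an "http"-like protocol with large
    # downstream bursts vs an "ssh"-like one with small interactive
    # packets flowing in both directions.
    n = 40
    if proto == "http":
        return np.c_[rng.normal(1200, 100, n), rng.exponential(0.01, n),
                     rng.choice([0, 1], n, p=[0.9, 0.1])]
    return np.c_[rng.normal(80, 20, n), rng.exponential(0.3, n),
                 rng.choice([0, 1], n, p=[0.5, 0.5])]

X = [summarize(fake_conn(p)) for p in ["http"] * 50 + ["ssh"] * 50]
y = [0] * 50 + [1] * 50
clf = KNeighborsClassifier(n_neighbors=3).fit(X, y)

probe = summarize(fake_conn("ssh"))
pred = clf.predict([probe])[0]
```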

5 0.13909164 1 jmlr-2006-A Direct Method for Building Sparse Kernel Learning Algorithms

Author: Mingrui Wu, Bernhard Schölkopf, Gökhan Bakır

Abstract: Many kernel learning algorithms, including support vector machines, result in a kernel machine, such as a kernel classifier, whose key component is a weight vector in a feature space implicitly introduced by a positive definite kernel function. This weight vector is usually obtained by solving a convex optimization problem. Based on this fact we present a direct method to build sparse kernel learning algorithms by adding one more constraint to the original convex optimization problem, such that the sparseness of the resulting kernel machine is explicitly controlled while at the same time performance is kept as high as possible. A gradient based approach is provided to solve this modified optimization problem. Applying this method to the support vector machine results in a concrete algorithm for building sparse large margin classifiers. These classifiers essentially find a discriminating subspace that can be spanned by a small number of vectors, and in this subspace, the different classes of data are linearly well separated. Experimental results over several classification benchmarks demonstrate the effectiveness of our approach. Keywords: sparse learning, sparse large margin classifiers, kernel learning algorithms, support vector machine, kernel Fisher discriminant
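The core idea, controlling sparseness by constraining the kernel expansion to a fixed small set of vectors, can be sketched by fixing k expansion vectors up front and fitting only their coefficients. The toy data, kernel width, and the ridge fit (in place of the paper's gradient-based solver for the constrained convex problem) are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)
# Two Gaussian blobs with labels -1 / +1.
X = np.vstack([rng.normal([-1, 0], 0.3, size=(100, 2)),
               rng.normal([1, 0], 0.3, size=(100, 2))])
y = np.r_[-np.ones(100), np.ones(100)]

def rbf(A, B, gamma=2.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

# Explicit sparseness constraint: the weight vector may be spanned by
# only k expansion vectors (picked two per blob here for simplicity;
# the paper optimizes their locations).
k = 4
Z = np.vstack([X[:2], X[100:102]])
K = rbf(X, Z)  # n x k kernel matrix against the expansion set

# Ridge fit of the k expansion coefficients.
alpha = np.linalg.solve(K.T @ K + 1e-3 * np.eye(k), K.T @ y)

pred = np.sign(K @ alpha)
acc = (pred == y).mean()
```

Prediction now touches only k kernel evaluations per point instead of one per support vector, which is the practical payoff of the sparseness constraint.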

6 0.12752698 88 jmlr-2006-Streamwise Feature Selection

7 0.12660974 4 jmlr-2006-A Linear Non-Gaussian Acyclic Model for Causal Discovery

8 0.11914607 57 jmlr-2006-Linear State-Space Models for Blind Source Separation

9 0.10987862 62 jmlr-2006-MinReg: A Scalable Algorithm for Learning Parsimonious Regulatory Networks in Yeast and Mammals

10 0.088191956 55 jmlr-2006-Linear Programming Relaxations and Belief Propagation -- An Empirical Study     (Special Topic on Machine Learning and Optimization)

11 0.086009644 43 jmlr-2006-Large Scale Multiple Kernel Learning     (Special Topic on Machine Learning and Optimization)

12 0.085572824 86 jmlr-2006-Step Size Adaptation in Reproducing Kernel Hilbert Space

13 0.085179031 11 jmlr-2006-Active Learning in Approximately Linear Regression Based on Conditional Expectation of Generalization Error

14 0.081431732 13 jmlr-2006-Adaptive Prototype Learning Algorithms: Theoretical and Experimental Studies

15 0.08064419 15 jmlr-2006-Bayesian Network Learning with Parameter Constraints     (Special Topic on Machine Learning and Optimization)

16 0.080005802 89 jmlr-2006-Structured Prediction, Dual Extragradient and Bregman Projections     (Special Topic on Machine Learning and Optimization)

17 0.071619906 71 jmlr-2006-Optimising Kernel Parameters and Regularisation Coefficients for Non-linear Discriminant Analysis

18 0.069775149 63 jmlr-2006-New Algorithms for Efficient High-Dimensional Nonparametric Classification

19 0.06203904 66 jmlr-2006-On Model Selection Consistency of Lasso

20 0.059602026 28 jmlr-2006-Estimating the "Wrong" Graphical Model: Benefits in the Computation-Limited Setting


similar papers computed by lda model

lda for this paper:

topicId topicWeight

[(8, 0.014), (36, 0.051), (45, 0.021), (50, 0.018), (61, 0.03), (63, 0.045), (64, 0.526), (68, 0.014), (70, 0.012), (76, 0.01), (78, 0.018), (81, 0.028), (84, 0.015), (90, 0.018), (91, 0.018), (96, 0.065)]

similar papers list:

simIndex simValue paperId paperTitle

same-paper 1 0.85622871 69 jmlr-2006-One-Class Novelty Detection for Seizure Analysis from Intracranial EEG

Author: Andrew B. Gardner, Abba M. Krieger, George Vachtsevanos, Brian Litt

Abstract: This paper describes an application of one-class support vector machine (SVM) novelty detection for detecting seizures in humans. Our technique maps intracranial electroencephalogram (EEG) time series into corresponding novelty sequences by classifying short-time, energy-based statistics computed from one-second windows of data. We train a classifier on epochs of interictal (normal) EEG. During ictal (seizure) epochs of EEG, seizure activity induces distributional changes in feature space that increase the empirical outlier fraction. A hypothesis test determines when the parameter change differs significantly from its nominal value, signaling a seizure detection event. Outputs are gated in a “one-shot” manner using persistence to reduce the false alarm rate of the system. The detector was validated using leave-one-out cross-validation (LOO-CV) on a sample of 41 interictal and 29 ictal epochs, and achieved 97.1% sensitivity, a mean detection latency of -7.58 seconds, and an asymptotic false positive rate (FPR) of 1.56 false positives per hour (Fp/hr). These results are better than those obtained from a novelty detection technique based on Mahalanobis distance outlier detection, and comparable to the performance of a supervised learning technique used in experimental implantable devices (Echauz et al., 2001). The novelty detection paradigm overcomes three significant limitations of competing methods: the need to collect seizure data, precisely mark seizure onset and offset times, and perform patient-specific parameter tuning for detector training. Keywords: seizure detection, novelty detection, one-class SVM, epilepsy, unsupervised learning

2 0.3756724 5 jmlr-2006-A Robust Procedure For Gaussian Graphical Model Search From Microarray Data With p Larger Than n

Author: Robert Castelo, Alberto Roverato

Abstract: Learning of large-scale networks of interactions from microarray data is an important and challenging problem in bioinformatics. A widely used approach is to assume that the available data constitute a random sample from a multivariate distribution belonging to a Gaussian graphical model. As a consequence, the prime objects of inference are full-order partial correlations which are partial correlations between two variables given the remaining ones. In the context of microarray data the number of variables exceeds the sample size and this precludes the application of traditional structure learning procedures because a sampling version of full-order partial correlations does not exist. In this paper we consider limited-order partial correlations, that is, partial correlations computed on marginal distributions of manageable size, and provide a set of rules that allow one to assess the usefulness of these quantities to derive the independence structure of the underlying Gaussian graphical model. Furthermore, we introduce a novel structure learning procedure based on a quantity, obtained from limited-order partial correlations, that we call the non-rejection rate. The applicability and usefulness of the procedure are demonstrated by both simulated and real data. Keywords: Gaussian distribution, gene network, graphical model, microarray data, non-rejection rate, partial correlation, small-sample inference
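The key quantity, a limited-order partial correlation, is cheap to compute. The sketch below applies the standard first-order recursion to a toy chain X → Y → Z, where X and Z are marginally correlated but (nearly) uncorrelated once Y is partialled out; the data and sample size are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(5)
# Toy chain X -> Y -> Z: X and Z are dependent marginally but
# conditionally independent given Y.
n = 2000
X = rng.normal(size=n)
Y = X + 0.5 * rng.normal(size=n)
Z = Y + 0.5 * rng.normal(size=n)

def partial_corr(a, b, c):
    """First-order (limited-order) partial correlation of a, b given c."""
    r_ab = np.corrcoef(a, b)[0, 1]
    r_ac = np.corrcoef(a, c)[0, 1]
    r_bc = np.corrcoef(b, c)[0, 1]
    return (r_ab - r_ac * r_bc) / np.sqrt((1 - r_ac**2) * (1 - r_bc**2))

marginal = np.corrcoef(X, Z)[0, 1]      # clearly nonzero
conditional = partial_corr(X, Z, Y)     # approximately zero
```

The vanishing conditional value is what licenses removing the X–Z edge from the graph; the paper aggregates many such small-conditioning-set tests into its non-rejection rate.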

3 0.19262519 41 jmlr-2006-Kernel-Based Learning of Hierarchical Multilabel Classification Models     (Special Topic on Machine Learning and Optimization)

Author: Juho Rousu, Craig Saunders, Sandor Szedmak, John Shawe-Taylor

Abstract: We present a kernel-based algorithm for hierarchical text classification where the documents are allowed to belong to more than one category at a time. The classification model is a variant of the Maximum Margin Markov Network framework, where the classification hierarchy is represented as a Markov tree equipped with an exponential family defined on the edges. We present an efficient optimization algorithm based on incremental conditional gradient ascent in single-example subspaces spanned by the marginal dual variables. The optimization is facilitated with a dynamic programming based algorithm that computes best update directions in the feasible set. Experiments show that the algorithm can feasibly optimize training sets of thousands of examples and classification hierarchies consisting of hundreds of nodes. Training of the full hierarchical model is as efficient as training independent SVM-light classifiers for each node. The algorithm’s predictive accuracy was found to be competitive with other recently introduced hierarchical multicategory or multilabel classification learning algorithms. Keywords: kernel methods, hierarchical classification, text categorization, convex optimization, structured outputs
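A much simpler baseline than the paper's joint max-margin Markov network still conveys the hierarchy constraint: train one classifier per node and let a node fire only if its parent fires. The tiny two-level label tree, the synthetic features, and the logistic classifiers below are all assumptions for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(8)
# Hypothetical label hierarchy: root "sci" with children "sci.math"
# and "sci.bio"; a document may carry several labels along a path.
hierarchy = {"sci": None, "sci.math": "sci", "sci.bio": "sci"}

X = rng.normal(size=(300, 4))
y = {"sci": (X[:, 0] > 0).astype(int)}
y["sci.math"] = ((y["sci"] == 1) & (X[:, 1] > 0)).astype(int)
y["sci.bio"] = ((y["sci"] == 1) & (X[:, 1] <= 0)).astype(int)

# One classifier per node (a per-node baseline, not the paper's joint
# model), with predictions made hierarchy-consistent.
models = {n: LogisticRegression().fit(X, y[n]) for n in hierarchy}

def predict(x):
    out = {}
    for node, parent in hierarchy.items():  # parents precede children
        fires = models[node].predict(x[None])[0] == 1
        out[node] = fires and (parent is None or out[parent])
    return {n for n, f in out.items() if f}

labels = predict(np.array([1.0, 1.0, 0.0, 0.0]))
```

The paper's contribution is to optimize all node decisions jointly with a max-margin objective rather than gating independent classifiers this way.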

4 0.18912464 52 jmlr-2006-Learning Spectral Clustering, With Application To Speech Separation

Author: Francis R. Bach, Michael I. Jordan

Abstract: Spectral clustering refers to a class of techniques which rely on the eigenstructure of a similarity matrix to partition points into disjoint clusters, with points in the same cluster having high similarity and points in different clusters having low similarity. In this paper, we derive new cost functions for spectral clustering based on measures of error between a given partition and a solution of the spectral relaxation of a minimum normalized cut problem. Minimizing these cost functions with respect to the partition leads to new spectral clustering algorithms. Minimizing with respect to the similarity matrix leads to algorithms for learning the similarity matrix from fully labelled data sets. We apply our learning algorithm to the blind one-microphone speech separation problem, casting the problem as one of segmentation of the spectrogram. Keywords: spectral clustering, blind source separation, computational auditory scene analysis
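The eigenstructure argument can be seen on a toy similarity matrix: for two well-separated clusters, the sign pattern of the graph Laplacian's Fiedler vector (the eigenvector of the second-smallest eigenvalue) recovers the partition. The Gaussian similarity and the unnormalized Laplacian here are simplifications of the paper's normalized-cut setting.

```python
import numpy as np

rng = np.random.default_rng(6)
# Two well-separated point clouds; the partition is recovered from the
# eigenstructure of the similarity matrix alone.
truth = np.r_[np.zeros(30), np.ones(30)]
X = np.vstack([rng.normal([0, 0], 0.3, size=(30, 2)),
               rng.normal([3, 3], 0.3, size=(30, 2))])

d2 = ((X[:, None] - X[None, :]) ** 2).sum(-1)
W = np.exp(-d2)          # similarity matrix
D = np.diag(W.sum(1))
L = D - W                # unnormalized graph Laplacian

# Fiedler vector: its sign pattern is the relaxed minimum-cut partition.
vals, vecs = np.linalg.eigh(L)
labels = (vecs[:, 1] > 0).astype(int)

# Agreement with the true clusters, up to a label swap.
agree = max((labels == truth).mean(), (labels != truth).mean())
```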

5 0.18569721 44 jmlr-2006-Large Scale Transductive SVMs

Author: Ronan Collobert, Fabian Sinz, Jason Weston, Léon Bottou

Abstract: We show how the concave-convex procedure can be applied to transductive SVMs, which traditionally require solving a combinatorial search problem. This provides for the first time a highly scalable algorithm in the nonlinear case. Detailed experiments verify the utility of our approach. Software is available at http://www.kyb.tuebingen.mpg.de/bs/people/fabee/transduction. html. Keywords: transduction, transductive SVMs, semi-supervised learning, CCCP
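As a crude stand-in for the transductive objective (and far simpler than the concave-convex procedure the paper actually derives), the sketch below alternates between labelling the unlabelled points with the current SVM and retraining on everything. The data, the four labelled seeds, and the five-iteration budget are all invented.

```python
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(7)
# Only 4 points are labelled; the rest are unlabelled points the
# transductive learner is allowed to exploit.
Xl = np.array([[-2.0, 0.0], [-1.5, 0.5], [2.0, 0.0], [1.5, -0.5]])
yl = np.array([0, 0, 1, 1])
Xu = np.vstack([rng.normal([-2, 0], 0.4, size=(100, 2)),
                rng.normal([2, 0], 0.4, size=(100, 2))])

# Naive alternating scheme: label the unlabelled data, retrain, repeat.
clf = LinearSVC().fit(Xl, yl)
for _ in range(5):
    yu = clf.predict(Xu)
    clf = LinearSVC().fit(np.vstack([Xl, Xu]), np.r_[yl, yu])

acc = (clf.predict(Xu) == np.r_[np.zeros(100), np.ones(100)]).mean()
```

This alternation is a combinatorial heuristic; the CCCP formulation replaces it with a sequence of convex problems, which is what makes the nonlinear case scale.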

6 0.18282729 61 jmlr-2006-Maximum-Gain Working Set Selection for SVMs     (Special Topic on Machine Learning and Optimization)

7 0.18279839 3 jmlr-2006-A Hierarchy of Support Vector Machines for Pattern Detection

8 0.18192837 51 jmlr-2006-Learning Sparse Representations by Non-Negative Matrix Factorization and Sequential Cone Programming     (Special Topic on Machine Learning and Optimization)

9 0.18164445 94 jmlr-2006-Using Machine Learning to Guide Architecture Simulation

10 0.18141741 53 jmlr-2006-Learning a Hidden Hypergraph

11 0.18016556 14 jmlr-2006-An Efficient Implementation of an Active Set Method for SVMs    (Special Topic on Machine Learning and Optimization)

12 0.17961505 1 jmlr-2006-A Direct Method for Building Sparse Kernel Learning Algorithms

13 0.17740154 56 jmlr-2006-Linear Programs for Hypotheses Selection in Probabilistic Inference Models     (Special Topic on Machine Learning and Optimization)

14 0.17710194 72 jmlr-2006-Parallel Software for Training Large Scale Support Vector Machines on Multiprocessor Systems     (Special Topic on Machine Learning and Optimization)

15 0.17669347 60 jmlr-2006-Manifold Regularization: A Geometric Framework for Learning from Labeled and Unlabeled Examples

16 0.17653242 25 jmlr-2006-Distance Patterns in Structural Similarity

17 0.17571692 11 jmlr-2006-Active Learning in Approximately Linear Regression Based on Conditional Expectation of Generalization Error

18 0.17509162 49 jmlr-2006-Learning Parts-Based Representations of Data

19 0.17490906 89 jmlr-2006-Structured Prediction, Dual Extragradient and Bregman Projections     (Special Topic on Machine Learning and Optimization)

20 0.17398654 28 jmlr-2006-Estimating the "Wrong" Graphical Model: Benefits in the Computation-Limited Setting