Education Data Utilization and Support Service

Dataset Details

United States

Algorithms for Spectral Decomposition with Applications

The analysis of spectral signals for features that represent physical phenomena is ubiquitous in the science and engineering communities. There are two main approaches to extracting relevant features from these high-dimensional data streams. The first relies on a physics-based paradigm, in which the underlying physical mechanism that generates the spectra is used to infer the most important features in the data stream. We focus on a complementary, data-driven methodology that is informed by the underlying physics but can also adapt to unmodeled system attributes and dynamics. We discuss four algorithms: the Spectral Decomposition Algorithm (SDA), Non-Negative Matrix Factorization (NMF), Independent Component Analysis (ICA), and Principal Components Analysis (PCA). We compare their performance on a spectral emulator that generates artificial data with known statistical properties. The emulator mimics the real-world phenomena arising from the plume of the space shuttle main engine and can be used to validate the results of various spectral decomposition algorithms. It is especially useful where real-world systems have very low probabilities of fault or failure. Our results indicate that methods like SDA and NMF provide a straightforward way of incorporating prior physical knowledge, while NMF with a tuning mechanism can give superior performance on some tests. We demonstrate the use of these algorithms to detect potential system-health issues on data from a spectral emulator with tunable health parameters.

Data Information

Data portal
United States
META URL
https://catalog.data.gov/dataset/algorithms-for-spectral-decomposition-with-applications
License
Not specified
Cost
Provider
National Aeronautics and Space Administration
Managing department
Data
- Landing page
- Srivastava-_JANNAF_2008_OPAD.pdf

Related Data

Highly Scalable Matching Pursuit Signal Decomposition Algorithm

Public Data Portal

In this research, we propose a variant of the classical Matching Pursuit Decomposition (MPD) algorithm with significantly improved scalability and computational performance. MPD is a powerful iterative algorithm that decomposes a signal into linear combinations of its dictionary elements, or "atoms". A best-fit atom from an arbitrarily defined dictionary is determined through cross-correlation. The selected atom is subtracted from the signal, and this procedure is repeated on the residual in subsequent iterations until a stopping criterion is met. A sufficiently large dictionary is required for an accurate reconstruction; this in turn increases the computational burden of the algorithm, limiting its applicability and level of adoption. Our main contribution lies in improving the computational efficiency of the algorithm to allow faster decomposition while maintaining a similar level of accuracy. The Correlation Thresholding and Multiple Atom Extractions techniques were proposed to decrease the computational burden of the algorithm. Correlation thresholds prune insignificant atoms from the dictionary. The ability to extract multiple atoms within a single iteration enhances the effectiveness and efficiency of each iteration. The proposed algorithm, entitled MPD++, was demonstrated using a real-world data set.
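
The classical loop described above (correlate, pick the best atom, subtract, repeat on the residual) can be sketched in a few lines of Python. This is a minimal illustration of plain matching pursuit only, not of MPD++; it assumes unit-norm atoms, and the function names, the toy dictionary, and the `tol` stopping rule are hypothetical choices made for the example.

```python
import math


def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def matching_pursuit(signal, dictionary, max_iters=10, tol=1e-9):
    """Greedy matching pursuit over unit-norm atoms (assumed, not checked).

    Returns (coeffs, residual), where coeffs maps an atom index to its
    accumulated weight.
    """
    residual = list(signal)
    coeffs = {}
    for _ in range(max_iters):
        # Correlate the current residual with every atom in the dictionary.
        correlations = [dot(residual, atom) for atom in dictionary]
        best = max(range(len(dictionary)), key=lambda i: abs(correlations[i]))
        c = correlations[best]
        if abs(c) < tol:  # hypothetical stopping criterion
            break
        coeffs[best] = coeffs.get(best, 0.0) + c
        # Subtract the selected atom's contribution from the residual.
        residual = [r - c * a for r, a in zip(residual, dictionary[best])]
    return coeffs, residual


# Toy dictionary: the standard basis of R^3 plus one diagonal atom.
s = 1.0 / math.sqrt(2.0)
atoms = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [s, s, 0.0]]
coeffs, residual = matching_pursuit([3.0, 0.0, 2.0], atoms)
```

In this toy case the signal is recovered from two atoms and the residual drops to zero. Each iteration costs one correlation per atom, which is the cost that grows with dictionary size.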

Virtual Sensors: Efficiently Estimating Missing Spectra

Public Data Portal

Various instruments are used to create images of the Earth and other objects in the universe in a diverse set of wavelength bands with the aim of understanding natural phenomena. Sometimes these instruments are built in a phased approach, with additional measurement capabilities added in later phases. In other cases, technology may mature to the point that the instrument offers new measurement capabilities that were not planned in the original design of the instrument. In still other cases, high resolution spectral measurements may be too costly to perform on a large sample and therefore lower resolution spectral instruments are used to take the majority of measurements. Many applied science questions that are relevant to the earth science remote sensing community require analysis of enormous amounts of data that were generated by instruments with disparate measurement capabilities. This paper addresses this problem using Virtual Sensors: a method that uses models trained on spectrally rich (high spectral resolution) data to "fill in" unmeasured spectral channels in spectrally poor (low spectral resolution) data. The models we use in this paper are Multi-Layer Perceptrons (MLPs), Support Vector Machines (SVMs) with Radial Basis Function (RBF) kernels and SVMs with Mixture Density Mercer Kernels (MDMK). We demonstrate this method by using models trained on the high spectral resolution Terra MODIS instrument to estimate what the equivalent of the MODIS 1.6 micron channel would be for the NOAA AVHRR/2 instrument. The scientific motivation for the simulation of the 1.6 micron channel is to improve the ability of the AVHRR/2 sensor to detect clouds over snow and ice.

Empirical Evaluation of Diagnostic Algorithm Performance Using a Generic Framework

Public Data Portal

A variety of rule-based, model-based and data-driven techniques have been proposed for detection and isolation of faults in physical systems. However, there have been few efforts to comparatively analyze the performance of these approaches on the same system under identical conditions. One reason for this was the lack of a standard framework to perform this comparison. In this paper we introduce a framework, called DXF, that provides a common language to represent the system description, sensor data and the fault diagnosis results; a run-time architecture to execute the diagnosis algorithms under identical conditions and collect the diagnosis results; and an evaluation component that can compute performance metrics from the diagnosis results to compare the algorithms. We have used DXF to perform an empirical evaluation of 13 diagnostic algorithms on a hardware testbed (ADAPT) at NASA Ames Research Center and on a set of synthetic circuits typically used as benchmarks in the model-based diagnosis community. Based on these empirical data we analyze the performance of each algorithm and suggest directions for future development.

Removing Spikes While Preserving Data and Noise using Wavelet Filter Banks

Public Data Portal

Many diagnostic datasets suffer from the adverse effects of spikes that are embedded in data and noise. For example, this is true for electrical power system data, where switches, relays, and inverters are major contributors to these effects. Spikes are harmful to data analysis mainly because they throw off real-time detection of abnormal conditions and classification of faults. Since noise and spikes are mixed together and embedded within the data, removal of the unwanted signals from the data is not always easy and may result in losing the integrity of the information carried by the data. Additionally, in some applications noise and spikes need to be filtered independently. The proposed algorithm is a multi-resolution filtering approach based on Haar wavelets that is capable of removing spikes while incurring insignificant damage to other data. In particular, noise in the data, which is a useful indicator that a sensor is healthy and not stuck, can be preserved using our approach. Presented here is the theoretical background with some examples from a realistic testbed.
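
The abstract does not spell out the filter bank, so the following Python sketch shows only the basic building block such an approach rests on: one level of a Haar transform in its averaging/differencing form, and its exact inverse. The `flag_spikes` helper and its threshold are hypothetical additions for illustration and are not the authors' spike-removal rule.

```python
def haar_step(x):
    """One level of the unnormalized Haar transform (pairwise mean and half-difference)."""
    if len(x) % 2:
        raise ValueError("signal length must be even")
    approx = [(x[i] + x[i + 1]) / 2 for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / 2 for i in range(0, len(x), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Rebuild the signal from one level of approximation and detail coefficients."""
    x = []
    for a, d in zip(approx, detail):
        x.extend([a + d, a - d])
    return x


def flag_spikes(detail, threshold):
    """Hypothetical rule: indices of detail coefficients larger than the threshold."""
    return [i for i, d in enumerate(detail) if abs(d) > threshold]


# A flat signal with a single spike at sample 3.
signal = [1.0, 1.0, 1.0, 9.0, 1.0, 1.0, 1.0, 1.0]
approx, detail = haar_step(signal)
suspect_pairs = flag_spikes(detail, threshold=2.0)
restored = haar_inverse(approx, detail)
```

A spike shows up as one large detail coefficient for the pair of samples that contains it, while small fluctuations yield small coefficients. A multi-resolution method would repeat `haar_step` on `approx` to examine coarser scales.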

Towards a Framework for Evaluating and Comparing Diagnosis Algorithms

Public Data Portal

Diagnostic inference involves the detection of anomalous system behavior and the identification of its cause, possibly down to a failed unit or to a parameter of a failed unit. Traditional approaches to solving this problem include expert/rule-based, model-based, and data-driven methods. Each approach (and each technique within an approach) uses a different representation of the knowledge required to perform the diagnosis. The sensor data is expected to be combined with these internal representations to produce the diagnosis result. In spite of the availability of various diagnosis technologies, there have been only minimal efforts to develop a standardized software framework to run, evaluate, and compare different diagnosis technologies on the same system. This paper presents a framework that defines a standardized representation of the system knowledge, the sensor data, and the form of the diagnosis results, and provides a run-time architecture that can execute diagnosis algorithms, send sensor data to the algorithms at appropriate time steps from a variety of sources (including the actual physical system), and collect resulting diagnoses. We also define a set of metrics that can be used to evaluate and compare the performance of the algorithms, and provide software to calculate the metrics.

Classifying Things That Go Bang in the Night

Public Data Portal

The automated, real-time classification of variable and transient events in terms of their astrophysical nature is quickly becoming a necessity for the new synoptic sky surveys. This generally has to be done using sparse and heterogeneous measurements for individual events, both from the survey pipelines and existing archives. The data we used in our tests come both from archival observations and from our own follow-up of recent transients from the PQ and CRTS surveys. See file for more information.

Anomaly Detection and Diagnosis Algorithms for Discrete Symbols

Public Data Portal

We present a set of novel algorithms, which we call sequenceMiner, that detect and characterize anomalies in large sets of high-dimensional symbol sequences that arise from recordings of switch sensors in the cockpits of commercial airliners. While the algorithms we present are general and domain-independent, we focus on a specific problem that is critical to determining the system-wide health of a fleet of aircraft. The approach taken uses unsupervised clustering of sequences using the normalized length of the longest common subsequence (nLCS) as a similarity measure, followed by detailed outlier analysis to detect anomalies. In this method, an outlier sequence is defined as a sequence that is far away from the cluster centre. We present new algorithms for outlier analysis that provide comprehensible indicators as to why a particular sequence is deemed to be an outlier. The algorithms provide a coherent description to an analyst of the anomalies in the sequence when compared to more normal sequences. In the final section of the paper we demonstrate the effectiveness of sequenceMiner for anomaly detection on a real set of discrete sequence data from a fleet of commercial airliners. We show that sequenceMiner discovers actionable and operationally significant safety events. We also compare our innovations with standard Hidden Markov Models, and show that our methods are superior.
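
As a rough illustration of the similarity measure named above, the Python sketch below computes the longest-common-subsequence length with the standard dynamic program and normalizes it. The abstract does not state the normalization, so dividing by the geometric mean of the two sequence lengths is an assumption here, as are the function names.

```python
import math


def lcs_length(a, b):
    """Length of the longest common subsequence, using O(len(b)) memory."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def nlcs(a, b):
    """LCS length divided by sqrt(len(a) * len(b)); the normalization is assumed."""
    if not a or not b:
        return 0.0
    return lcs_length(a, b) / math.sqrt(len(a) * len(b))


# Sequences of discrete symbols, such as switch events, can be strings or lists.
same = nlcs("abc", "abc")      # identical sequences
disjoint = nlcs("abc", "xyz")  # no symbols in common
```

Under this normalization the score lies between 0 and 1, so `1 - nlcs(a, b)` could serve as a distance for clustering and for measuring how far a sequence sits from a cluster centre.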

Key Real-World Applications of Classifier Ensembles

Public Data Portal

Broad classes of statistical classification algorithms have been developed and applied successfully to a wide range of real world domains. In general, ensuring that the particular classification algorithm matches the properties of the data is crucial in providing results that meet the needs of the particular application domain. One way in which the impact of this algorithm/application match can be alleviated is by using ensembles of classifiers, where a variety of classifiers (either different types of classifiers or different instantiations of the same classifier) are pooled before a final classification decision is made. Intuitively, classifier ensembles allow the different needs of a difficult problem to be handled by classifiers suited to those particular needs. Mathematically, classifier ensembles provide an extra degree of freedom in the classical bias/variance tradeoff, allowing solutions that would be difficult (if not impossible) to reach with only a single classifier. Because of these advantages, classifier ensembles have been applied to many difficult real world problems. In this paper, we survey select applications of ensemble methods to problems that have historically been most representative of the difficulties in classification. In particular, we survey applications of ensemble methods to remote sensing, person recognition, one vs. all recognition, and medicine.
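
One simple way to pool classifiers before a final decision is majority voting. The Python sketch below illustrates that idea only; the abstract does not say which pooling schemes the survey covers, and the threshold "classifiers" are hypothetical stand-ins for trained models.

```python
from collections import Counter


def majority_vote(classifiers, x):
    """Return the label predicted by the largest number of classifiers.

    Ties resolve to the label that was voted for first.
    """
    votes = [clf(x) for clf in classifiers]
    label, _count = Counter(votes).most_common(1)[0]
    return label


# Three hypothetical base classifiers that disagree near the decision boundary.
classifiers = [
    lambda x: x > 0.3,
    lambda x: x > 0.5,
    lambda x: x > 0.7,
]

high = majority_vote(classifiers, 0.6)  # votes: True, True, False
low = majority_vote(classifiers, 0.4)   # votes: True, False, False
```

Here the pooled decision follows the middle threshold, smoothing out the two more extreme classifiers.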

MESSENGER E/V/H MASCS 4 VIRS DERIVED DATA V2.0

Public Data Portal

This data set consists of the MESSENGER MASCS VIRS derived observations, also known as DDRs. The MASCS VIRS experiment is a fixed concave grating spectrograph with a beam splitter that simultaneously disperses the spectrum onto two photodiode arrays. There are two VIRS DDR data products, one for each array, which result in coverage of the wavelength ranges of the visible (VIS) and near infrared (NIR). This is version 2 of this data set; version 1 is available on PDS volume MESSMAS_2001.
