AWM Academic Papers

Andrew speaking at CMU in April 2014 Andrew W. Moore

Academic Papers

Theoretical Justification of Popular Link Prediction Heuristics (COLT 2010)
Fast Nearest-neighbor Search in Disk-resident Graphs (KDD 2010)
Fast Dynamic Reranking in Large Graphs (2009)
Efficient Algorithms for Large-Scale Asteroid Discovery (2008)
Forecasting Web Page Views: Methods and Observations (JMLR 2008)
In order to book display advertisements in advance, you need to estimate the new number of views in advance.
Fast Incremental Proximity Search in Large Graphs (2008)
Slightly revised from the ICML camera-ready version
Fast State Discovery for HMM Model Selection and Learning (2007)
A Tractable Approach to Finding Closest Truncated-commute-time Neighbors in Large Graphs (2007)
Handbook of Biosurveillance, (Elsevier Books, 2006)
Sequential Update of ADtrees (2006)
Dependency Trees in Sub-linear Time and Bounded Memory (2006)
Efficient learning of dependency trees for huge datasets.
Monitoring Food Safety by Detecting Patterns in Consumer Complaints (2006)
Autonomous Visualization (2006)
Efficient Analytics for Effective Monitoring of Biomedical Security (2005)
Making Logistic Regression A Core Data Mining Tool With TR-IRLS (2005)
This short paper is the easiest, fastest way to learn about Truncated Regularized Iteratively Re-weighted Least Squares (TR-IRLS), my algorithm for fast, parameter-free logistic regression. TR-IRLS can also be used for any generalized linear model. This
A Multiple Tree Algorithm for the Efficient Association of Asteroid Observations (2005)
Alias Detection in Link Data Sets (2005)
Combining string similarity with contextual similarity when searching for aliases using active learning.
Making Logistic Regression A Core Data Mining Tool: A Practical Investigation of Accuracy, Speed, and Simplicity (2005)
Regularized logistic regression can be fast, accurate, and simple. This paper includes the most important findings of my thesis, and a few new details.
Fast Inference and Learning in Large-State-Space HMMs (2005)
Efficiently Identifying Close Track/Observation Pairs in Continuous Timed Data (2005)
Detecting Significant Multidimensional Spatial Clusters (2005)
Applying the fast multidimensional spatial scan statistic to detect clusters in epidemiological and brain imaging data.
A Bayesian scan statistic for spatial cluster detection (2005)
A new Bayesian method for cluster detection
Dynamic Social Network Analysis using Latent Space Models (2005)
A Bayesian spatial scan statistic (2005)
A new Bayesian method for spatial cluster detection
Detecting Anomalous Patterns in Pharmacy Retail Data (2005)
A bio-surveillance system to collect disease outbreak feedback from public health officials
Variable KD-Tree Algorithms for Spatial Pattern Search and Discovery (2005)
Anomalous Spatial Cluster Detection (2005)
A general and powerful framework for spatial cluster detection.
Finding Optimal Bayesian Networks by Dynamic Programming (2005)
Learning the optimal Bayes net structure
An Investigation of Practical Approximate Nearest Neighbor Algorithms (2004)
How to use variations on classic exact data structures for nearest neighbor, if you want to get faster answers and are prepared to accept approximation?
High-Dimensional Probabilistic Classification for Drug Discovery (2004)
Discriminative probabilistic classifiers have been used successfully on large life-sciences datasets, but high dimensionalities have prohibited the use of nonparametric class probability estimation. This paper explores a method (SLAMDUNK) which addresses
Fast Nonlinear Regression via Eigenimages Applied to Galactic Morphology (2004)
Determining the shapes of millions of galaxies.
Active Learning for Anomaly and Rare-Category Detection (2004)
How to use active learning in a real-life scenario.
Semantic based Biomedical Image Indexing and Retrieval (2004)
Volumetric pathological neuroimage retrieval under the framework of classification-driven feature selection.
The IOC algorithm: Efficient Many-Class Non-parametric Classification for High-Dimensional Data (2004)
Performing k-nearest-neghbor classifications on multi-class problems without actually finding the k-nearest neighbors.
Rapid Detection of Significant Spatial Clusters (2004)
A new spatial scan algorith that searches over arbitrary rectangles in addition to squares.
Alias Detection in Link Data Sets (2004)
An active learning approach to deciding whether two names correspond to the same entity, combining string similarity information and context similarity.
Tractable Learning of Large Bayes Net Structures from Sparse Data (2004)
in this paper we propose an algorithm that allows to learn a Bayes Net structure from sparse data (e.g., power-law distributed) with over 100,000 variables. we also report time and performance accuracy when applied to several very large datasets
Belief State Approaches to Signaling Alarms in Surveillance Systems (2004)
A Fast Multi-Resolution Method for Detection of Significant Spatial Disease Clusters (2003)
Rapid detection of disease clusters using a fast spatial scan statistic algorithm.
Rapid Evaluation of Multiple Density Models (2003)
A way to quickly evaluate and compare multiple nonparametric density estimates.
Efficient Exact k-NN and Nonparametric Classification in High Dimensions (2003)
Can we do non-approximate k-NN classification without actually finding the k-NN?
A Fast Multi-Resolution Method for Detection of Significant Spatial Overdensities (2003)
Scaling up the classical Kulldorff scan statistic.
Bayesian Network Anomaly Pattern Detection for Disease Outbreaks (2003)
Handles temporal trends in data by replacing the baseline of WSARE 2.0 with a baseline generated by a Bayesian network.
Empirical Bayes Screening for Link Analysis (2003)
An algorithm for discovering top N strange co-occurences of size 2,3,4, etc Uses ideas of frequent sets, but stratifies them according to a statistically justified hierarchical bayes model, using empirical bayes to find the parameters
What's Strange About Recent Events (2003)
A shorter paper on WSARE that was submitted to the Journal of Urban Health
Optimal Reinsertion: A new search operator for accelerated and more accurate Bayesian network structure learning (2003)
Very aggressive but computationally efficient search steps for Bayes net learning.
Tractable Group Detection on Large Link Data Sets (2003)
We present the k-groups algorithm, an improvement of the GDA algorithm that includes significant computational advantages. The k-groups algorithm allows tractable group detection on large data sets.
Finding Underlying Connections: A Fast Graph-Based Method for Link Analysis and Collaboration Queries (2003)
CGraph is an algorithm to quickly learn a graph-based model of the underlying connections of a set of entities given link data.
cGraph: A Fast Graph-Based Method for Link Analysis and Queries (2003)
This paper is an extended version of the 2003 ICML conference paper.
Probabilistic Noise Identification and Data Cleaning (2003)
We examine the use of explicit noise and corruption models to aid in the task of noise identification and data cleaning.
A Comparison of Statistical and Machine Learning Algorithms on the Task of Link Completion (2003)
This paper examines the task of link completion, relative algorithm performance, and what this can tell us about the structure of the data.
Fast Robust Logistic Regression for Large Sparse Datasets with Binary Outputs (2003)
Logistic regression can provide faster, better results than SVM for life-sciences datasets with hundreds of thousands of attributes.
Active Learning in Discrete Input Spaces (2002)
Using modified Gittins indices to decide which datapoint to actively label next whilst being rewarded for each new label.
Real-valued All-Dimensions search: Low-overhead rapid searching over subsets of attributes (2002)
Searching over large numbers of contingency tables quickly
Using Tarjan's Red Rule for Fast Dependency Tree Construction (2002)
Very fast growth of dependency trees.
Interpolating Conditional Density Trees (2002)
Very fast non-parametric Bayesian Network nodes
Efficient Algorithms for Non-Parametric Clustering with Clutter (2002)
Finding and counting the high density regions in spatial data.
Stochastic Link and Group Detection (2002)
This paper introduces the GDA algorithm. We use noisy link data (n-tuples of entities) to learn underlying groupings of entities.
Rule-based Anomaly Pattern Detection for Detecting Disease Outbreaks (2002)
Answering the question: "What's Strange About Recent Events?"
Summary of Biosurveillance-relevant statistical and data mining technologies (2002)
A brief informal survey of some techniques that have been used for Biosurveillance.
Variable Resolution Discretization in Optimal Control (2002)
A short paper on choosing the right resolution in a tessalation of state space.
Reinforcement Learning for Cooperating and Communicating Reactive Agents in Electrical Power Grids (2001)
One approach to distributed reinforcement learning
Classification-Driven Pathological Neuroimage Retrieval Using Statistical Asymmetry Measures (2001)
Using machine learning to detect abnormalities in neuro-imaging output.
N-Body Problems in Statistical Learning (2001)
A way to use multiple trees simultaneously to solve a large class of statistical problems efficiently.
Repairing Faulty Mixture Models using Density Estimation (2001)
Intelligent automatic selection of new mixture model components
Mixtures of Rectangles: Interpretable Soft Clustering (2001)
A mixture model that is easily readable by humans.
Direct Policy Search using Paired Statistical Tests (2001)
If you're going to choose the best policy by roll-outs, how can statistical tests and "racing" help?
Mixnets: Learning Bayesian Networks with mixtures of discrete and continuous attributes (2000)
Bayes Nets with Mixture Models in the nodes
Learning Filaments (2000)
A generative model and efficient algorithm for identifying noisy networks of points in k-dimensional space
X-means: Extending K-means with Efficient Estimation of the Number of Clusters (2000)
Extension to popular K-means, where the number of clusters K is also estimated.
A Dynamic Adaptation of AD-trees for Efficient Machine Learning on Large Data Sets (2000)
Fast implementation of on-demand AD-trees for scores of high-arity attributes and millions of rows.
Condensed Representations for computationally tractable data mining of massive sky surveys (2000)
A Nonparametric Approach to Noisy and Costly Optimization (2000)
Optimizing a noisy process in a possibly discontinuous or non-euclidian space.
The Anchors Hierarchy: Using the Triangle Inequality to Survive High-Dimensional Data (2000)
Using Ball-trees allows cached sufficient statistics-based accelerations even in high dimensions. The Anchors approach quickly optimizes the Ball-tree structure for this purpose.
Multi-Value-Functions: Efficient Automatic Action Hierarchies for Multiple Goal MDPs (1999)
An efficient procedure to approximately compute all policies for all possible goal states.
Accelerating Exact k-means Algorithms with Geometric Reasoning (Extended version) (1999)
This is an extended version of the KDD99 paper.
Accelerating Exact k-means Algorithms with Geometric Reasoning (1999)
Using cached counts and a different kind of search operator during k-means updates, with no approximation
Distributed Value Functions (1999)
Distributed Reinforcement learning for applications such a power grids
Variable Resolution discretizations for high-accuracy solutions of optimal control problems (1999)
Influence and Variance of a Markov Chain: Application to adaptive discretization in optimal control (1999)
You're using variable resolution splines to approximate a value function. Where is most profitable to increasing the resolution?
Efficient Multi-Object Dynamic Query Histograms (1999)
Using Multiresolution kdtrees to accelerate visualization algorithms
Bayesian Networks for Lossless Dataset Compression (1999)
Practical ways to compress large tabular datasets
Very Fast EM-based Mixture Model Clustering Using Multiresolution KD-trees (1999)
Using kdtrees with centroids in the nodes can allow accurate EM updates in time sublinear in the number of records
Value Function Based Production Scheduling (1998)
Production scheduling in which we account for a probability distribution on future jobs by means of kernel-based value function approximation
Q2: Memory-based active learning for optimizing noisy continuous functions (1998)
Maximizing a very noisy function in k-dimensional space with few samples
Cached Sufficient Statistics for Efficient Machine Learning with Large Datasets (1998)
Introduces AD-trees: a new way to implicitly pre-cache the answers to all possible counting queries for a dataset.
Learning Evaluation Functions for Global Optimization and Boolean Satisfiability (1998)
AD-trees for Fast Counting and for Fast Learning of Association Rules (1998)
Using AD-trees to learn conjunctive rules via beam search.
Applying online search techniques to Continuous-State reinforcement learning (1998)
Using a specialized version of A-star to boost the performance of approximate value functions
The Racing Algorithm: Model Selection for Lazy Learners (1997)
A detailed analysis and study of Racing.
Efficient Locally Weighted Polynomial Regression Predictions (1997)
Using Multiresolution KD-trees with cached first and second moments in the nodes
Locally Weighted Learning (1997)
Survey of the use of kernel functions in kernel regression, locally weighted regression and related function approximators.
Using Prediction to Improve Combinatorial Optimization Search (1997)
Automatically improving combinatorial search by reinforcement-learning-style analysis of earlier runs
Locally Weighted Learning for Control (1997)
How can kernel methods and locally weighted regression help robots learn to control themselves?
A tutorial on using the Vizier memory-based learning system (1997)
A tutorial on using the Windows Vizier software for fast locally weighted and k-NN style classification and regression.
Simulation-based optimization of a stochastic product coating problem using hash-table cacheing of costs (1997)
Memory-based Stochastic Optimization (1996)
Using locally weighted regression to model response surfaces and to choose the next experiment
Reinforcement Learning: A Survey (1996)
Surveys MDPs, TD, Q-learning and many other Reinforcement Learning staples.
Algorithms for Approximating Optimal Value Functions in Acyclic Domains (1996)
Using "Rollouts" to make value-function-based RL more practical
Multiresolution Instance-based Learning (1995)
Multiresolution Kd-trees with cached statistics for accelerating kernel regression
The Parti-game Algorithm for Variable Resolution Reinforcement Learning in Multidimensional State-spaces (1995)
Automatic variable resolution discretization of a multidimensional state space during searches for shortest paths
Learning Automated Product Recommendations Without Observable Features: An Initial Investigation (1995)
Generalization in Reinforcement Learning: Safely Approximating the Value Function (1995)
An introduction to the ways that naive application of function approximation of value functions can fail.
An Empirical Investigation of Brute Force to choose Features, Smoothers and Function Approximators (1995)
What happens when you use very intense cross-validation?
Proceedings of the Workshop on Value Function Approximation, Machine Learning Conference 1995. (1995)
Short talks from a workshop about value function approximation
Variable Resolution Reinforcement Learning (1995)
Briefly surveys a number of approaches to making Bellman updates faster
The Parti-game Algorithm for Variable Resolution Reinforcement Learning in Multidimensional State-spaces (1994)
A short introduction to an efficient learning-shortest-paths algorithm.
Efficient Algorithms for Minimizing Cross Validation Error (1994)
Hoeffding Races: Accelerating Model Selection Search for Classification and Function Approximation (1994)
Perform many cross-validation operations in a round-robin fashion, pruning likely non-winners early.
A short tutorial note on computing information gain from counts (1994)
A simple 2-page tutorial.
Prioritized Sweeping: Reinforcement Learning with Less Data and Less Real Time (1993)
As estimates of rewards and transition probabilites improve during learning, how can we efficiently allow the value function to keep up?
Memory-based Reinforcement Learning: Efficient Computation with Prioritized Sweeping (1992)
Using a priority queue to schedule the most useful value function updates
Fast, Robust Adaptive Control by Learning only Forward Models (1992)
A real robot pool player achieves high accuracy by learning in a forward direction.
A tutorial on kd-trees (1991)
A description of Bentley et al's classic nearest neighbor algorithm
Variable Resolution Dynamic Programming: Efficiently Learning Action Maps in Multivariate Real-valued State-spaces (1991)
Knowledge of Knowledge and Intelligent Experimentation for Learning Control (1991)
Efficient Memory-based Learning for Robot Control (1990)
Using KD-trees, nearest neighbor and active learning.
Acquisition of Dynamic Control Knowledge for a Robotic Manipulator (1990)
Experiments in Adaptive State Space Robotics (1989)
Using a nearest neighbor classifier to design control experiments
Learning Robotic Control: PhD. Thesis Proposal (1988)
Thesis Proposal