CSAIL Digital Archive
https://hdl.handle.net/1721.1/29806
Last updated: 11 Dec 2025, 20:32:53 GMT
https://hdl.handle.net/1721.1/157073
On the Complexity of Neural Computation in Superposition
Adler, Micah; Shavit, Nir
Recent advances in the understanding of neural networks suggest that superposition, the ability of a single neuron to represent multiple features simultaneously, is a key mechanism underlying the computational efficiency of large-scale networks. This paper explores the theoretical foundations of computing in superposition, focusing on explicit, provably correct algorithms and their efficiency.
We present the first lower bounds showing that for a broad class of problems, including permutations and pairwise logical operations, a neural network computing in superposition requires at least Ω(m′ log m′) parameters and Ω(√(m′ log m′)) neurons, where m′ is the number of output features being computed. This implies that any "lottery ticket" sparse sub-network must have at least Ω(m′ log m′) parameters, no matter the size of the initial dense network. Conversely, we show a nearly tight upper bound: logical operations like pairwise AND can be computed using O(√(m′) log m′) neurons and O(m′ log^2 m′) parameters. There is thus an exponential gap between computing in superposition, the subject of this work, and representing features in superposition, which can require as little as O(log m′) neurons by the Johnson-Lindenstrauss Lemma.
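As a concrete illustration of the representational side of this gap, here is a minimal Python sketch of storing sparse features in superposition via a random Johnson-Lindenstrauss-style embedding; the dimension constant, sparsity level, and inner-product readout are illustrative assumptions, not the paper's construction.

    import numpy as np

    rng = np.random.default_rng(0)
    m_prime = 10_000                       # number of features
    d = int(40 * np.log(m_prime))          # O(log m') dimensions; constant is a guess

    # Random embedding: each feature gets a near-orthogonal unit-norm direction.
    E = rng.normal(0.0, 1.0 / np.sqrt(d), size=(d, m_prime))

    # Superpose a sparse set of active features into a single d-dim activation.
    active = rng.choice(m_prime, size=5, replace=False)
    x = E[:, active].sum(axis=1)

    # Read each feature back by inner product: active features score near 1,
    # inactive ones near 0, up to O(1/sqrt(d)) interference noise.
    scores = E.T @ x
    recovered = np.argsort(scores)[-5:]
    print(sorted(active), sorted(recovered))   # the same five indices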
Our hope is that our results open a path for using complexity theoretic techniques in neural network interpretability research.
Published: 30 Sep 2024
https://hdl.handle.net/1721.1/153053
Belief Programming Implementation
Atkinson, Eric
Published: 27 Nov 2023
https://hdl.handle.net/1721.1/152179
Speranza: Usable, privacy-friendly software signing
Merrill, Kelsey; Newman, Zachary; Torres-Arias, Santiago; Sollins, Karen
Software repositories, used for wide-scale open software distribution, are a significant vector for security attacks. Software signing provides authenticity, mitigating many such attacks. Developer-managed signing keys pose usability challenges, but certificate-based systems introduce privacy problems. This work, Speranza, uses certificates to verify software authenticity but still provides anonymity to signers using zero-knowledge identity co-commitments.
In Speranza, a signer uses an automated certificate authority (CA) to create a private identity-bound signature and proof of authorization. Verifiers check that a signer was authorized to publish a package without learning the signer’s identity. The package repository privately records each package’s authorized signers, but publishes only commitments to identities in a public map. Then, when issuing certificates, the CA issues the certificate to a distinct commitment to the same identity. The signer then creates a zero-knowledge proof that these are identity co-commitments.
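To illustrate the co-commitment idea, here is a toy Python sketch of two Pedersen commitments to the same identity plus a Schnorr-style proof that they hide the same value. The group parameters are tiny and insecure, and this is the general technique rather than Speranza's actual implementation (a real deployment would use an elliptic-curve group, and h must be a generator whose discrete log with respect to g is unknown to everyone).

    import hashlib, secrets

    p, q = 1019, 509            # toy safe prime p = 2q + 1 (INSECURE, illustrative)
    g, h = 4, 9                 # two generators of the order-q subgroup

    def commit(m, r):
        """Pedersen commitment C = g^m * h^r mod p."""
        return (pow(g, m, p) * pow(h, r, p)) % p

    def prove_equal(m, r1, r2):
        """Prove C(m, r1) and C(m, r2) commit to the same m, via knowledge of
        delta = r1 - r2 such that C1 / C2 = h^delta (a Schnorr proof)."""
        C1, C2 = commit(m, r1), commit(m, r2)
        delta = (r1 - r2) % q
        k = secrets.randbelow(q)
        A = pow(h, k, p)
        c = int(hashlib.sha256(f"{A},{C1},{C2}".encode()).hexdigest(), 16) % q
        z = (k + c * delta) % q
        return C1, C2, (A, z)

    def verify_equal(C1, C2, proof):
        A, z = proof
        c = int(hashlib.sha256(f"{A},{C1},{C2}".encode()).hexdigest(), 16) % q
        ratio = (C1 * pow(C2, -1, p)) % p      # C1 / C2 = h^(r1 - r2)
        return pow(h, z, p) == (A * pow(ratio, c, p)) % p

    identity = 42               # e.g., a hashed signer identity (hypothetical)
    C1, C2, pi = prove_equal(identity, secrets.randbelow(q), secrets.randbelow(q))
    print(verify_equal(C1, C2, pi))            # True: same identity, unlinkable commitments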
We implemented a proof-of-concept for Speranza. We find that costs to maintainers (signing) and end users (verifying) are small (sub-millisecond), even for a repository with millions of packages. Techniques inspired by recent key transparency systems reduce the bandwidth for serving authorization policies to 2 KiB. Server costs in this system are negligible. Our evaluation finds that Speranza is practical on the scale of the largest software repositories.
We also emphasize practicality and deployability in this project. By building on existing technology and employing relatively simple and well-established cryptographic techniques, Speranza can be deployed for wide-scale use with only a few hundred lines of code and minimal changes to existing infrastructure. Speranza is a practical way to bring privacy and authenticity together for more trustworthy open-source software.
This is an extended version of the shorter paper by the same name, published in ACM CCS 2023.
Published: 19 Sep 2023
https://hdl.handle.net/1721.1/151174
How Can Large Language Models Help Humans in Design And Manufacturing?
Makatura, Liane; Foshey, Michael; Wang, Bohan; Hähnlein, Felix; Ma, Pingchuan; Deng, Bolei; Tjandrasuwita, Megan; Spielberg, Andrew; Owens, Crystal Elaine; Chen, Peter Yichen; Zhao, Allan; Zhu, Amy; Norton, Wil J; Gu, Edward; Jacob, Joshua; Li, Yifei; Schulz, Adriana; Matusik, Wojciech
Published: 27 Jul 2023
https://hdl.handle.net/1721.1/150908
Counterfactual Explanations and Predictive Models to Enhance Clinical Decision-Making in Schizophrenia using Digital Phenotyping
Canas, Juan Sebastian; Gomez, Francisco; Costilla Reyes, Omar
Clinical practice in psychiatry is burdened by increasing demand for healthcare services and the scarce resources available to meet it. New paradigms of health data, powered by machine learning techniques, could open the possibility of improving clinical workflows at critical stages of assessment and treatment in psychiatry.
In this work, we propose a machine learning system capable of predicting, detecting, and explaining individual changes in the symptoms of patients with schizophrenia, using behavioral digital phenotyping data. We forecast patients' symptoms with an error rate below 10%.
The system detects decreases in symptoms using changepoint algorithms and uses counterfactual explanations as a recourse in a simulated continuous monitoring scenario in healthcare. Overall, this study offers valuable insights into the performance and potential of counterfactual explanations, predictive models, and change-point detection within a simulated clinical workflow. These findings lay the foundation for further research to explore additional facets of the workflow, aiming to enhance its effectiveness and applicability in real-world healthcare settings. By leveraging these components, the goal is to develop an actionable, interpretable, and trustworthy integrative decision support system that combines real-time clinical assessments with sensor-based inputs.
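As a rough illustration of the changepoint component, the following Python sketch flags a sustained decrease in a hypothetical daily symptom score using a one-sided CUSUM statistic; this is a generic detector with assumed parameters, not the authors' algorithm.

    import numpy as np

    def cusum_decrease(scores, baseline, drift=0.5, threshold=4.0):
        """Return the first index where a downward shift is detected, else None."""
        s = 0.0
        for t, x in enumerate(scores):
            # accumulate evidence that x has dropped below baseline by more than drift
            s = max(0.0, s + (baseline - x) - drift)
            if s > threshold:
                return t
        return None

    rng = np.random.default_rng(1)
    scores = np.concatenate([rng.normal(6, 1, 30),    # stable daily symptom scores
                             rng.normal(3, 1, 30)])   # sustained decrease at day 30
    print(cusum_decrease(scores, baseline=6.0))       # fires shortly after day 30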
Published: 15 Jun 2023
https://hdl.handle.net/1721.1/148149
Automated Exposure Notification for COVID-19
Rivest, Ronald; Schiefelbein, M. Curran; Zissman, Marc A.; Bay, Jason; Bugnion, Edouard; Finnerty, Jill; Liccardi, Ilaria; Nelson, Brad; Norige, Adam S.; Shen, Emily H.; Wanger, Jenny; Yahalom, Raphael; Alekseyev, Jesslyn D.; Brubaker, Chad; Ferretti, Luca; Ishikawa, Charlie; Raykova, Mariana; Schlaman, Brendan; Schwartz, Robert X.; Sudduth, Emma; Tessaro, Stefano
Private Automated Contact Tracing (PACT) was a collaborative effort formed at the beginning of the Coronavirus Disease 2019 (COVID-19) pandemic. PACT’s mission was to enhance contact tracing in pandemic response by designing exposure-detection functions in personal digital communication devices that have maximal public health utility while preserving privacy. PACT had four major lines of effort: proximity detection efficacy, privacy, public health integration, and public health efficacy. In support of these lines of effort, PACT executed several cross-layer activities that helped demonstrate public health efficacy. These included prototype development and demonstrations; system analysis; data collection and experimentation; and large-scale deployment support. PACT convened two scientific workshops on privacy-preserving automated exposure notification (AEN): one virtual workshop in April 2020 and a second, hybrid workshop in October 2021. This report is an outcome of the second workshop and serves as PACT’s final report. It seeks to explain and discuss the use of automated exposure notification during the COVID-19 pandemic and to provide recommendations for those who may design and deploy similar technologies in future pandemics.
The authors were among the 70+ in-person and virtual participants in the October 2021 ImPACT workshop. This final report has been heavily influenced by the discussion at that workshop.
Published: 22 Feb 2023
https://hdl.handle.net/1721.1/145783
Neurosymbolic Programming for Science
Sun, Jennifer J; Tjandrasuwita, Megan; Sehgal, Atharva; Solar-Lezama, Armando; Chaudhuri, Swarat; Yue, Yisong; Costilla Reyes, Omar
Neurosymbolic Programming (NP) techniques have the potential to accelerate scientific discovery across fields. These models combine neural and symbolic components to learn complex patterns and representations from data, using high-level concepts or known constraints. As a result, NP techniques can interface with symbolic domain knowledge from scientists, such as prior knowledge and experimental context, to produce interpretable outputs. Here, we identify opportunities and challenges in connecting current NP models to scientific workflows, with real-world examples from behavior analysis. We define concrete next steps to move the field of NP for science forward and to enable its broad use across the natural and social sciences.
Published: 12 Oct 2022
https://hdl.handle.net/1721.1/145253
Multi-modal and Inertial sensor Solutions for Navigation-type Factor Graphs
Fourie, Dehann
This thesis presents a sum-product inference algorithm for platform navigation called Multi-modal iSAM (incremental smoothing and mapping). Common Gaussian-only likelihoods are restrictive and require complex front-end processes to deal with non-Gaussian measurements. Instead, our approach allows the front-end to defer ambiguities by using non-Gaussian measurement models. We retain the acyclic Bayes tree (and incremental update strategy) from the predecessor iSAM2 max-product algorithm [Kaess et al., IJRR 2012]. The approach propagates continuous beliefs on the Bayes (junction) tree, which is an efficient symbolic refactorization of the nonparametric factor graph, and asymptotically approximates the underlying Chapman-Kolmogorov equations. Our method tracks dominant modes in the marginal posteriors of all variables with minimal approximation error, while suppressing almost all low-likelihood modes (in a non-permanent manner). In keeping with existing inertial navigation practice, we present a novel, continuous-time, retroactively calibrating inertial odometry residual function, using preintegration to seamlessly incorporate pure inertial sensor measurements into a factor graph. We centralize around a factor graph (with starved graph databases) to separate elements of the navigation system into an ecosystem of processes. Practical examples are included, such as inferring multi-modal marginal posterior belief estimates for ambiguous loop closures, raw beam-formed acoustic measurements, and conventional parametric likelihoods.
Published: 31 Aug 2017
https://hdl.handle.net/1721.1/143430
Universal Motion Generator: Trajectory Autocompletion by Motion Prompts
Wang, Yanwei; Shah, Julie
Foundation models, which are large neural networks trained on massive datasets, have shown impressive generalization in both the language and vision domains. While fine-tuning foundation models for new tasks at test time is impractical because these models have billions of parameters, prompts have been employed to re-purpose models for test-time tasks on the fly. In this report, we ideate the equivalent foundation model for motion generation and the corresponding formats of prompt that can condition such a model. The central goal is to learn a behavior prior for motion generation that can be re-used in a novel scene.
Published: 15 Jun 2022
https://hdl.handle.net/1721.1/138144
Active Loop Detection for Applications that Access Databases
Shen, Jiasi; Rinard, Martin
We present Shear, a new system that observes and manipulates the interaction between an application and its surrounding environment to learn a model of the application's behavior. Shear implements active loop detection to infer the loop structures in the application. This technique repeatedly presents the application with the same input, altering the program's interaction with the environment at precisely chosen execution points to elicit different program behaviors depending on the loop structure in the application. The ability to alter these interactions lets Shear infer loop structures that would be undetectable given only the ability to observe application behavior, and therefore a broader range of loop structures than previous approaches.
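The following Python sketch illustrates the active loop detection idea in miniature (it is not Shear itself): the harness reruns a hypothetical application on the same input while varying how many rows a stubbed database returns, and concludes that a query sits inside a loop when the output size tracks the result-set size.

    def app(db):
        """Hypothetical application under test: one query inside one loop."""
        out = []
        for row in db.query("SELECT name FROM users"):
            out.append(f"hello {row}")
        return out

    class StubDB:
        """Environment controlled by the harness: returns n synthetic rows."""
        def __init__(self, n):
            self.n = n

        def query(self, sql):
            return [f"user{i}" for i in range(self.n)]

    def query_in_loop(app, row_counts=(1, 2, 4)):
        # Rerun the same input while varying the environment's response size.
        sizes = [len(app(StubDB(n))) for n in row_counts]
        # Output size tracking result-set size implies per-row (loop) behavior.
        return all(s == n for s, n in zip(sizes, row_counts))

    print(query_in_loop(app))   # True: the query is executed inside a loop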
Published: 15 Nov 2021
https://hdl.handle.net/1721.1/131244
Active Loop Detection for Applications that Access Databases
Shen, Jiasi; Rinard, Martin
We present Shear, a new system that observes and manipulates the interaction between an application and its surrounding environment to learn a model of the application's behavior. Shear implements active loop detection to infer the looping structure in the application. This technique repeatedly presents the application with the same input, altering the program's interaction with the environment at precisely chosen execution points to elicit different program behaviors depending on the loop structure in the application. The ability to alter these interactions lets Shear infer looping structures that would be undetectable given only the ability to observe application behavior, and therefore a broader range of looping structures than previous approaches.
Published: 9 Sep 2021
https://hdl.handle.net/1721.1/130057
Bucket Elimination Algorithm for Dynamic Controllability Checking of Simple Temporal Networks with Uncertainty
Zhang, Yuening
Simple Temporal Networks with Uncertainty (STNU) can represent temporal problems where the duration between events may be uncontrollable, e.g., when an event is caused by nature. An STNU is dynamically controllable (DC) if it can be successfully scheduled online. In this paper, we introduce a novel use of bucket elimination algorithms for DC checking that matches the state of the art in achieving O(n^3) performance. Bucket elimination algorithms exist for STNs (path consistency and Fourier algorithms), but adapting them to STNUs is non-trivial. As a result of our formulation, consistency checking becomes a special case of our algorithm. Because bucket elimination is a familiar framework, the final algorithm is easier to understand and implement, and conflict extraction is also easily supported.
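The STN special case mentioned above can be sketched in a few lines of Python: eliminating a variable combines each incoming/outgoing pair of difference constraints, as in Fourier-Motzkin, and a negative self-loop signals inconsistency. This is plain bucket elimination for STN consistency, not the paper's full STNU DC-checking algorithm.

    def stn_consistent(n, edges):
        """edges: dict (u, v) -> w, meaning t_v - t_u <= w.
        Returns False iff elimination exposes a negative cycle."""
        edges = dict(edges)
        for x in range(n):                      # elimination order: the "buckets"
            inc = [(u, w) for (u, v), w in edges.items() if v == x and u != x]
            out = [(v, w) for (u, v), w in edges.items() if u == x and v != x]
            for u, w1 in inc:                   # combine u -> x -> v into u -> v
                for v, w2 in out:
                    if u == v:
                        if w1 + w2 < 0:         # negative self-loop: inconsistent
                            return False
                        continue
                    key = (u, v)
                    edges[key] = min(edges.get(key, float("inf")), w1 + w2)
            edges = {e: w for e, w in edges.items() if x not in e}
        return True

    # t1 - t0 in [10, 20] is fine; adding t1 - t0 <= 5 is inconsistent:
    print(stn_consistent(2, {(0, 1): 20, (1, 0): -10}))   # True
    print(stn_consistent(2, {(0, 1): 5, (1, 0): -10}))    # False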
Published: 2 Mar 2021
https://hdl.handle.net/1721.1/130056
Lower Bounds on the Column Sparsity of Compressed Sensing Matrices
Nachin, Mergen
Published: 2 Mar 2021
https://hdl.handle.net/1721.1/122969
Comprehensive Java Metadata Tracking for Attack Detection and Repair
Perkins, Jeff; Eikenberry, Jordan; Coglio, Alessandro; Rinard, Martin
We present ClearTrack, a system that tracks 32 bits of metadata for each primitive value in Java programs to detect and nullify a range of vulnerabilities, including integer overflow and underflow, SQL injection, and command injection vulnerabilities. Contributions include new techniques for eliminating false positives associated with benign integer overflows and underflows, new metadata-aware techniques for detecting and nullifying SQL and command injection attacks, and results from an evaluation of ClearTrack performed by a Test and Evaluation team hired by the sponsor of this research (an anonymous agency of the United States government). These results show that 1) ClearTrack operates successfully on Java programs comprising hundreds of thousands of lines of code (with instrumented jar files and Java system libraries included, the majority of the applications comprise over 3 million lines of code), 2) because of computations such as cryptography and hash table calculations, these applications perform millions of benign integer overflows and underflows, and 3) ClearTrack successfully detects and nullifies all tested integer overflow and underflow, SQL injection, and command injection vulnerabilities in the benchmark applications.
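A minimal Python sketch of the per-value metadata idea follows (ClearTrack itself instruments Java bytecode; the flag layout and sink policy here are illustrative assumptions): each value carries a bitmask that is unioned through arithmetic, so a later sink check can recognize that a tainted value overflowed on its way there.

    OVERFLOWED   = 1 << 0
    USER_TAINTED = 1 << 1
    INT_MIN, INT_MAX = -2**31, 2**31 - 1

    class Tagged:
        """A primitive value plus a small metadata bitmask."""
        def __init__(self, value, meta=0):
            self.value, self.meta = value, meta

        def __add__(self, other):
            v = self.value + other.value
            meta = self.meta | other.meta            # metadata is unioned
            if not (INT_MIN <= v <= INT_MAX):        # simulate 32-bit wraparound
                v = (v - INT_MIN) % 2**32 + INT_MIN
                meta |= OVERFLOWED
            return Tagged(v, meta)

    def sink_check(x):
        """Policy check at a sink, e.g. before sizing a memory allocation."""
        if x.meta & OVERFLOWED and x.meta & USER_TAINTED:
            raise ValueError("tainted value overflowed before reaching sink")
        return x.value

    n = Tagged(2**31 - 1, USER_TAINTED) + Tagged(10)   # wraps to a negative int
    try:
        sink_check(n)
    except ValueError as e:
        print("nullified:", e)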
Published: 19 Nov 2019
https://hdl.handle.net/1721.1/122968
Precise and Comprehensive Provenance Tracking for Android Devices
Gordon, Michael; Eikenberry, Jordan; Eden, Anthony; Perkins, Jeff; Rinard, Martin
Detailed information about the paths that data take through a system is invaluable for understanding sources and behaviors of complex exfiltration malware. We present a new system, ClearScope, that tracks, at the level of individual bytes, the complete paths that data follow through Android systems. These paths include the original source where data entered the device (such as sensors or network connections), files in which the data was temporarily stored, applications that the data traversed during its time in the device, and sinks through which the data left the device.
The ClearScope system design enables this unprecedented level of provenance tracking detail by 1) structuring the provenance representation as references, via provenance tags, to provenance events that record the movement of data between system components and into or out of the device and 2) adopting a split design in which provenance events are streamed to a remote server for storage, with only the minimal information required to generate the tagged stream of events retained on the device. ClearScope also includes compiler optimizations that enable efficient provenance tracking within applications by eliminating unnecessary provenance tracking computations and adopting an efficient aggregate provenance representation for arrays when all array elements have the same provenance.
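A minimal Python sketch of this split design (illustrative only, not ClearScope's on-device implementation): data carries only small integer tags, each referring to a provenance event in a stream destined for a remote server, so replaying the tag chain off-device reconstructs the full path the bytes took.

    import json

    event_stream = []                    # stands in for the remote server

    def emit(kind, **fields):
        """Record a provenance event; its index is the tag that data carries."""
        tag = len(event_stream)
        event_stream.append({"tag": tag, "kind": kind, **fields})
        return tag

    # Data enters the device from a source; the bytes carry only the tag.
    t_net = emit("source", channel="network", peer="203.0.113.7")
    payload = (b"secret", t_net)         # (bytes, provenance tag)

    # Each hop is one more event referencing the previous tag.
    t_file = emit("flow", frm=t_net, into="/data/cache.tmp")
    t_sink = emit("sink", frm=t_file, channel="network", peer="198.51.100.9")

    # Replaying the tag chain off-device reconstructs the complete path:
    # network source -> temporary file -> network sink.
    print(json.dumps(event_stream, indent=2))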
Experience using ClearScope to analyze the notorious Adups FOTA malware highlights the significant benefits that this level of comprehensive detail can bring. Performance experiments with the CaffeineMark benchmarks show that the overall ClearScope provenance tracking overhead on this benchmark suite is 14%.
Published: 19 Nov 2019
https://hdl.handle.net/1721.1/121993
Faster Dynamic Controllability Checking in Temporal Networks with Integer Bounds
Bhargava, Nikhil; Williams, Brian C.
Simple Temporal Networks with Uncertainty (STNUs) provide a useful formalism with which to reason about events and the temporal constraints that apply to them. STNUs are notable in particular because they facilitate reasoning over stochastic, or uncontrollable, actions and their corresponding durations. To evaluate the feasibility of a set of constraints associated with an STNU, one checks the network's dynamic controllability, which determines whether an adaptive schedule can be constructed on the fly. Our work provides a dynamic controllability checker that is able to quickly refute the controllability of an STNU with integer bounds, such as those found in planning problems. It is faster than the best existing approaches for networks with integer bounds, running in O(min(mn, m√n log N) + km + k^2n + kn log n) time. Our approach pre-processes the STNU using an existing O(n^3) dynamic controllability checking algorithm and provides tighter bounds on its runtime. This makes our work easily adaptable to other algorithms that rely on checking variants of dynamic controllability.
Published: 1 Aug 2019
https://hdl.handle.net/1721.1/121246
Automatic Exploitation of Fully Randomized Executables
Gadient, Austin; Ortiz, Baltazar; Barrato, Ricardo; Davis, Eli; Perkins, Jeff; Rinard, Martin
We present Marten, a new end-to-end system that automatically discovers, exploits, and combines information leakage and buffer overflow vulnerabilities to derandomize remote, fully randomized executables and generate control-flow hijacking exploits against them. Results from two case studies highlight Marten’s ability to generate short, robust ROP chain exploits that bypass address space layout randomization and other modern defenses to download and execute injected code selected by an attacker.
Published: 11 Jun 2019
https://hdl.handle.net/1721.1/119255
Gen: A General-Purpose Probabilistic Programming System with Programmable Inference
Cusumano-Towner, Marco F.; Saad, Feras A.; Lew, Alexander; Mansinghka, Vikash K.
Probabilistic modeling and inference are central to many fields. A key challenge for wider adoption of probabilistic programming languages is designing systems that are both flexible and performant. This paper introduces Gen, a new probabilistic programming system with novel language constructs for modeling and for end-user customization and optimization of inference. Gen makes it practical to write probabilistic programs that solve problems from multiple fields. Gen programs can combine generative models written in Julia, neural networks written in TensorFlow, and custom inference algorithms based on an extensible library of Monte Carlo and numerical optimization techniques. This paper also presents techniques that enable Gen’s combination of flexibility and performance: (i) the generative function interface, an abstraction for encapsulating probabilistic and/or differentiable computations; (ii) domain-specific languages with custom compilers that strike different flexibility/performance tradeoffs; (iii) combinators that encode common patterns of conditional independence and repeated computation, enabling speedups from caching; and (iv) a standard inference library that supports custom proposal distributions also written as programs in Gen. This paper shows that Gen outperforms state-of-the-art probabilistic programming systems, sometimes by multiple orders of magnitude, on problems such as nonlinear state-space modeling, structure learning for real-world time series data, robust regression, and 3D body pose estimation from depth images.
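Gen itself is written in Julia; the following Python sketch conveys only the spirit of programmable inference, in which generic inference code (here, self-normalized importance sampling) accepts a user-written model and a user-written proposal as ordinary programs. It is not Gen's API.

    import math, random

    def model_logpdf(x, y_obs):
        # prior x ~ Normal(0, 1); likelihood y ~ Normal(x, 0.5)
        # (normalizing constants omitted; they cancel in self-normalized IS)
        return -0.5 * x**2 - 0.5 * ((y_obs - x) / 0.5) ** 2

    def custom_proposal(y_obs):
        # user-written proposal program: sample near the observation
        x = random.gauss(y_obs, 0.5)
        logq = -0.5 * ((x - y_obs) / 0.5) ** 2
        return x, logq

    def importance_sample(y_obs, n=5000):
        xs, logws = [], []
        for _ in range(n):
            x, logq = custom_proposal(y_obs)
            xs.append(x)
            logws.append(model_logpdf(x, y_obs) - logq)
        m = max(logws)
        ws = [math.exp(lw - m) for lw in logws]
        z = sum(ws)
        return sum(w * x for w, x in zip(ws, xs)) / z   # posterior mean estimate

    print(importance_sample(y_obs=2.0))   # approx. 0.8 * 2.0 = 1.6 for this model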
Published: 26 Nov 2018
https://hdl.handle.net/1721.1/118307
Towards Understanding Generalization via Analytical Learning Theory
Kawaguchi, Kenji; Bengio, Yoshua; Verma, Vikas; Kaelbling, Leslie Pack
This paper introduces a novel measure-theoretic theory for machine learning that does not require statistical assumptions. Based on this theory, a new regularization method in deep learning is derived and shown to outperform previous methods on CIFAR-10, CIFAR-100, and SVHN. Moreover, the proposed theory provides a theoretical basis for a family of practically successful regularization methods in deep learning. We discuss several consequences of our results for one-shot learning, representation learning, deep learning, and curriculum learning. Unlike statistical learning theory, the proposed learning theory analyzes each problem instance individually via measure theory, rather than a set of problem instances via statistics. As a result, it provides different types of results and insights when compared to statistical learning theory.
Published: 1 Oct 2018
https://hdl.handle.net/1721.1/118184
Using Dynamic Monitoring to Synthesize Models of Applications That Access Databases
Shen, Jiasi; Rinard, Martin
We previously developed Konure, a tool that uses active learning to infer the functionality of database applications. An alternative approach is to observe the inputs, outputs, and database traffic from a running system in normal use and then synthesize a model of the application from this information. To evaluate these two approaches, we present Etch, which uses information from typical usage scenarios to synthesize a model of the functionality of database applications whose computation can be expressed in the Konure DSL.
Published: 27 Sep 2018
https://hdl.handle.net/1721.1/117593
Using Active Learning to Synthesize Models of Applications That Access Databases
Shen, Jiasi; Rinard, Martin
We present a new technique that uses active learning to infer models of applications that manipulate relational databases. This technique comprises a domain-specific language for modeling applications that access databases (each model is a program in this language) and an associated inference algorithm that infers models of applications whose behavior can be expressed in this language. The inference algorithm generates test inputs and database configurations, runs the application, then observes the resulting database traffic and outputs to progressively refine its current model hypothesis. The end result is a model that completely captures the behavior of the application. Because the technique works only with the externally observable inputs, outputs, and databases, it can infer the behavior of applications written in arbitrary languages using arbitrary coding styles (as long as the behavior of the application is expressible in the domain-specific language).
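The inference loop can be caricatured in a few lines of Python (an illustrative toy, not Konure's algorithm or DSL): the learner chooses database configurations that distinguish the remaining hypotheses, runs the opaque application, and keeps only the hypotheses consistent with the observed output.

    def opaque_app(db, user_id):
        """Application under inference; its source is treated as unreadable."""
        return db[user_id]["email"]

    def infer_column(app, columns=("name", "email", "phone")):
        hypotheses = set(columns)
        probe_id = 1
        while len(hypotheses) > 1:
            # Choose a database configuration that distinguishes hypotheses:
            # give every candidate column a unique marker value.
            db = {probe_id: {c: f"<{c}-marker>" for c in columns}}
            out = app(db, probe_id)                 # run and observe output
            hypotheses = {c for c in hypotheses if f"<{c}-marker>" in out}
        return hypotheses.pop()

    print(infer_column(opaque_app))    # "email"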
We also present a technique for automatically regenerating an implementation from the inferred model. The regenerator can produce a translated implementation in a different language and systematically include relevant security and error checks.
Published: 28 Aug 2018
https://hdl.handle.net/1721.1/116172
Data and Code for "A New Approach to Animacy Detection"
Jahan, Labiba; Chauhan, Geeticka; Finlayson, Mark A.
This archive contains the code and data for the workshop article "A New Approach to Animacy Detection," published in 2018 in the 27th International Conference on Computational Linguistics (COLING 2018), in Santa Fe, NM. The root of the archive contains a readme file which explains the archive contents. Furthermore, the archive can be imported directly into the Eclipse IDE as a project encapsulating the executable code and data required to reproduce the results of the paper; the code compiles with Java 1.8. The archive also contains a copy of the near-final version of the paper for reference.
Published: 7 Jun 2018
https://hdl.handle.net/1721.1/115882
Best-first Enumeration Based on Bounding Conflicts, and its Application to Large-scale Hybrid Estimation
Timmons, Eric; Williams, Brian C.
With the rise of autonomous systems, there is a need for them to have high levels of robustness and safety. This robustness can be achieved through systems that are self-repairing. Underlying this is the ability to diagnose subtle failures. Likewise, online planners can generate novel responses to exceptional situations. These planners require an accurate estimate of state. Estimation methods based on hybrid discrete/continuous state models have emerged as a method of computing precise state estimates, which can be employed for either diagnosis or planning in hybrid domains. However, existing methods have difficulty scaling to systems with more than a handful of components. Discrete state estimation capabilities can scale to this level by combining best-first enumeration and conflict-directed search. Best-first methods have been developed for hybrid estimation, but the creation of conflict-directed methods has previously been elusive. While conflicts are used to learn from constraint violation, probabilistic hybrid estimation is relatively unconstrained. In this paper we present an approach to hybrid estimation that unifies best-first enumeration and conflict-directed search through the concept of "bounding" conflicts, an extension of conflicts that represents tighter bounds on the cost of regions of the search space. This paper presents a general best-first search and enumeration algorithm based on bounding conflicts (A*BC) and a hybrid estimation method based on this enumeration algorithm. Experiments show that an A*BC-powered state estimator produces estimates faster than the current state of the art, particularly on large systems.
Published: 24 May 2018
https://hdl.handle.net/1721.1/115482
Learning Models of Sequential Decision-Making without Complete State Specification using Bayesian Nonparametric Inference and Active Querying
Unhelkar, Vaibhav V.; Shah, Julie A.
Learning models of decision-making behavior during sequential tasks is useful across a variety of applications, including human-machine interaction. In this paper, we present an approach to learning such models within Markovian domains based on observing and querying a decision-making agent. In contrast to classical approaches to behavior learning, we do not assume complete knowledge of the state features that impact an agent's decisions. Using tools from Bayesian nonparametric inference and time series of agents' decisions, we first provide an inference algorithm to identify the presence of any unmodeled state features that impact decision making, as well as likely candidate models. In order to identify the best model among these candidates, we next provide an active querying approach that resolves model ambiguity by querying the decision maker. Results from our evaluations demonstrate that, using the proposed algorithms, an observer can identify the presence of latent state features, recover their dynamics, and estimate their impact on decisions during sequential tasks.
Published: 17 May 2018
https://hdl.handle.net/1721.1/115274
Generalization in Deep Learning
Kawaguchi, Kenji; Kaelbling, Leslie Pack; Bengio, Yoshua
With a direct analysis of neural networks, this paper presents a mathematically tight generalization theory to partially address an open problem regarding the generalization of deep learning. Unlike previous bound-based theory, our main theory is quantitatively as tight as possible for every dataset individually, while still producing competitive qualitative insights. Our results give insight into why and how deep learning can generalize well despite its large capacity, complexity, possible algorithmic instability, nonrobustness, and sharp minima, answering an open question in the literature. We also discuss limitations of our results and propose additional open problems.
Published: 1 May 2018
https://hdl.handle.net/1721.1/113912
A Natural Language Interface for Mobile Devices
Katz, Boris; Borchardt, Gary; Felshin, Sue; Mora, Federico
Creating a robust, automated capability to respond to natural language requests has been a longstanding goal in the development of intelligent systems. This article describes the StartMobile system, originally developed in 2005-2007, which has served as an important precursor to Apple's Siri system and other commercial natural language interfaces to mobile devices and computational resources. The article begins with a discussion of goals in creating natural language interfaces, continues with a description of the general-purpose START information access system, describes the StartMobile system and its capabilities, and concludes with a discussion of current commercial systems and future directions.
Published: 1 Mar 2018
https://hdl.handle.net/1721.1/113372
Continuous Relaxation to Over-constrained Temporal Plans
Yu, Peng
When humans fail to understand the capabilities of an autonomous system or its environmental limitations, they can jeopardize their objectives and the system by asking for unrealistic goals. The objective of this thesis is to enable consensus between humans and autonomous systems by giving autonomous systems the ability to communicate to the user the reasons for goal failure and the relaxations to goals that achieve feasibility. We represent our problem in the context of over-constrained temporal plans, which are commonly encountered while operating autonomous and decision support systems, when user objectives are in conflict with the environment. Over-constrained plans are addressed by relaxing goals and/or constraints, such as delaying the arrival time of a trip, with some candidate relaxations being preferable to others. In this thesis we present Uhura, a temporal plan diagnosis and relaxation algorithm that is designed to take over-constrained input plans with temporal flexibility and contingencies, and generate temporal relaxations that make the input plan executable. We introduce two innovative approaches within Uhura: collaborative plan diagnosis and continuous relaxation. Uhura focuses on novel ways of satisfying three goals that make the plan relaxation process more convenient for users: small perturbation, quick response, and simple interaction. We have incorporated Uhura within an autonomous executive that collaborates with human operators to resolve over-constrained temporal plans. Its effectiveness has been demonstrated both in simulation and in hardware on a Personal Transportation System concept. We believe that Uhura's collaborative temporal plan diagnosis capability can benefit a wide range of applications, both in industrial settings and in our daily lives.
SM thesis
Published: 25 Jan 2013
https://hdl.handle.net/1721.1/113371
Risk Allocation for Temporal Risk Assessment
Wang, Andrew J.
Temporal uncertainty arises when performing any activity in the natural world. When activities are composed into temporal plans, then, there is a risk of not meeting the plan requirements. Currently, we do not have quantitatively precise methods for assessing the temporal risk of a plan. Existing methods that deal with temporal uncertainty either forgo probabilistic models or try to optimize a single objective, rather than satisfy multiple objectives. This thesis offers a method for evaluating whether a schedule exists that meets a set of temporal constraints with acceptable risk of failure. Our key insight is to assign a risk allocation to each source of temporal uncertainty in the plan, such that we may reformulate the probabilistic plan into an STNU parameterized on the risk allocation. We show that the problem becomes a deterministic one of finding a risk allocation which implies a schedulable STNU within acceptable risk. By leveraging the principles behind STNU analysis, we derive conditions which encode this problem as a convex feasibility program over risk allocations. Furthermore, these conditions may be learned incrementally as temporal conflicts. Thus, to boost computational efficiency, we employ a generate-and-test approach to determine whether a schedule may be found.
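In standard risk-allocation notation (illustrative symbols, not necessarily the thesis's own), the reformulation rests on Boole's inequality: if each uncertain duration d_i is restricted to an interval I_i(δ_i) that holds except with probability δ_i, then

    \[
      \Pr[\text{schedule failure}]
        \;\le\; \sum_i \Pr[\, d_i \notin I_i(\delta_i) \,]
        \;\le\; \sum_i \delta_i \;\le\; \Delta ,
    \]

so it suffices to find a nonnegative allocation (δ_1, ..., δ_n) whose sum is within the acceptable risk Δ and under which the induced STNU is schedulable, which is a deterministic feasibility problem over the allocation.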
MEng thesis
Published: 31 Jan 2013
https://hdl.handle.net/1721.1/113370
Energy-efficient Control of a Smart Grid with Sustainable Homes based on Distributing Risk
Ono, Masahiro
The goal of this thesis is to develop a distributed control system for a smart grid with sustainable homes. A central challenge is how to enhance energy efficiency in the presence of uncertainty. A major source of uncertainty in a smart grid is intermittent energy production by renewable energy sources. In the face of global climate change, it is crucial to reduce dependence on fossil fuels and shift to renewable energy sources, such as wind and solar. However, a large-scale introduction of wind and solar generation to an electrical grid poses a significant risk of blackouts, since the energy supplied by the renewables is unpredictable and intermittent. Therefore, an important challenge is to develop an intelligent control mechanism for the electrical grid that is both reliable and efficient. Uncertain weather conditions and human behavior also pose challenges for a smart home. For example, autonomous room temperature control of a residential building may occasionally make the room environment uncomfortable for residents. Autonomous controllers must be able to take residents' preferences as an input, and to control the indoor environment in an energy-efficient manner while limiting the risk of failing to meet the residents' requirements in the presence of uncertainties. In order to overcome these challenges, we propose a distributed robust control method for a smart grid that includes smart homes as its building components. The proposed method consists of three algorithms: 1) a market-based contingent energy dispatcher for an electrical grid, 2) a risk-sensitive plan executive for temperature control of a residential building, and 3) a chance-constrained model-predictive controller with a probabilistic guarantee of constraint satisfaction, which can control continuously operating systems such as an electrical grid and a building. We build the three algorithms upon the chance-constrained programming framework: minimization of a given cost function subject to chance constraints, which bound the probability of failing to satisfy given state constraints. Although these technologies provide promising capabilities, they cannot contribute to sustainability unless they are accepted by society. In this thesis we specify policy challenges for a smart grid and a smart home, and discuss policy options that give society economic and regulatory incentives to introduce these technologies on a large scale.
SM thesis
Published: 20 Jan 2012
https://hdl.handle.net/1721.1/113369
Robust, Goal-directed Plan Execution with Bounded Risk
Ono, Masahiro
There is an increasing need for robust optimal plan execution for multi-agent systems in uncertain environments, while guaranteeing an acceptable probability of success. For example, fleets of unmanned aerial vehicles (UAVs) and autonomous underwater vehicles (AUVs) are required to operate autonomously for extensive mission durations in uncertain environments. Previous work introduced the concept of a model-based executive, which increases the level of autonomy, elevating the level at which systems are commanded. This thesis develops model-based executives that reason explicitly from a stochastic plant model to find the optimal course of action, while ensuring that the probability of failure is within a user-specified risk bound. This thesis presents two robust model-based executives: probabilistic Sulu (p-Sulu) and distributed probabilistic Sulu (dp-Sulu). The objective for p-Sulu and dp-Sulu is to allow users to command continuous, stochastic multi-agent systems in a manner that is both intuitive and safe. The user specifies the desired evolution of the plant state, as well as the acceptable probabilities of failure, as a temporal plan on states called a chance-constrained qualitative state plan (CCQSP). An example of a CCQSP statement is "go to A through B within 30 minutes, with less than 0.001% probability of failure." p-Sulu and dp-Sulu take a CCQSP, a continuous plant model with stochastic uncertainty, and an objective function as inputs, and output an optimal continuous control sequence, as well as an optimal discrete schedule. The difference between the two is that p-Sulu plans in a centralized manner while dp-Sulu plans in a distributed manner; dp-Sulu enables robust CCQSP execution for multi-agent systems. We solve the problem based on the key concept of risk allocation, which achieves tractability by allocating the specified risk to individual constraints and mapping the result into an equivalent deterministic constrained optimization problem. Risk allocation also enables distributed plan execution for multi-agent systems by distributing the risk among agents to decompose the optimization problem. Building upon the risk allocation approach, we develop our first CCQSP executive, p-Sulu, in four spirals. First, we develop the Convex Risk Allocation (CRA) algorithm, which can solve a CCQSP planning problem with a convex state space and a fixed schedule, highlighting the capability of optimally allocating risk to individual constraints. Second, we develop the Non-convex Iterative Risk Allocation (NIRA) algorithm, which can handle non-convex state spaces. Third, we build upon NIRA a full-horizon CCQSP planner, p-Sulu FH, which can optimize not only the control sequence but also the schedule. Fourth, we develop p-Sulu, which enables the real-time execution of CCQSPs by employing the receding-horizon approach. Our second CCQSP executive, dp-Sulu, is developed in two spirals. First, we develop the Market-based Iterative Risk Allocation (MIRA) algorithm, which can control a multi-agent system in a distributed manner by optimally distributing risk among agents through the market-based method called tatonnement. Second and finally, we integrate the capability of MIRA into p-Sulu to build the robust model-based executive, dp-Sulu, which can execute CCQSPs on multi-agent systems in a distributed manner. Our simulation results demonstrate that our executives can efficiently execute CCQSP planning problems with significantly reduced suboptimality compared to prior art.
PhD thesis
Published: 2 Feb 2012
https://hdl.handle.net/1721.1/113368
Unsupervised Learning and Recognition of Physical Activity Plans
Dong, Shuonan
This thesis aims to enable a new kind of interaction between humans and computational agents, such as robots or computers, by allowing the agent to anticipate and adapt to human intent. In the future, more robots may be deployed in situations that require collaboration with humans, such as scientific exploration, search and rescue, hospital assistance, and even domestic care. These situations require robots to work together with humans as part of a team, rather than as stand-alone tools. The intent recognition capability is necessary for computational agents to play a more collaborative role in human-robot interactions, moving beyond the standard master-slave relationship of humans and computers today. We provide an innovative capability for recognizing human intent through statistical plan learning and online recognition. We approach the plan learning problem by employing unsupervised learning to automatically determine the activities in a plan based on training data. The plan activities are described by a mixture of multivariate probability densities, where the number of distributions in the mixture is assumed to be given. The training data trajectories are fed again through the activities' density distributions to determine each possible sequence of activities that make up a plan. These activity sequences are then summarized, with temporal information, in a temporal plan network, which encodes all possible plans. Our approach to plan recognition begins with formulating the temporal plan network as a hidden Markov model. Next, we determine the most likely path using the Viterbi algorithm. Finally, we refer back to the temporal plan network to obtain predicted future activities. Our research presents several innovations: First, we introduce a modified representation of temporal plan networks that incorporates probabilistic information into the state space and temporal representations. Second, we learn plans from actual data, such that the notion of an activity is not arbitrarily or manually defined, but is determined by the characteristics of the data. Third, we develop a recognition algorithm that can perform recognition continuously by making probabilistic updates. Finally, our recognizer not only identifies previously executed activities, but also predicts future activities based on the plan network. We demonstrate the capabilities of our algorithms on motion capture data. Our results show that the plan learning algorithm is able to generate reasonable temporal plan networks, depending on the dimensions of the training data and the recognition resolution used. The plan recognition algorithm is also successful in recognizing the correct activity sequences in the temporal plan network corresponding to the observed test data.
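The recognition step is standard HMM decoding; the following Python sketch shows Viterbi over a small activity model (generic decoding, not the thesis's full temporal-plan-network formulation).

    import numpy as np

    def viterbi(log_pi, log_A, log_B):
        """log_pi: (K,) initial, log_A: (K,K) transitions, log_B: (T,K) obs."""
        T, K = log_B.shape
        dp = np.full((T, K), -np.inf)
        back = np.zeros((T, K), dtype=int)
        dp[0] = log_pi + log_B[0]
        for t in range(1, T):
            scores = dp[t - 1][:, None] + log_A      # (K, K): prev -> cur
            back[t] = scores.argmax(axis=0)
            dp[t] = scores.max(axis=0) + log_B[t]
        path = [int(dp[-1].argmax())]
        for t in range(T - 1, 0, -1):                # backtrack the best path
            path.append(int(back[t][path[-1]]))
        return path[::-1]

    # Two activities, sticky transitions, observations favoring 0 then 1:
    log_pi = np.log([0.5, 0.5])
    log_A = np.log([[0.9, 0.1], [0.1, 0.9]])
    log_B = np.log([[0.8, 0.2]] * 3 + [[0.2, 0.8]] * 3)
    print(viterbi(log_pi, log_A, log_B))   # [0, 0, 0, 1, 1, 1]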
SM thesis
Published: 23 Aug 2007
https://hdl.handle.net/1721.1/113367
Learning and recognition of hybrid manipulation tasks in variable environments using probabilistic flow tubes
Dong, Shuonan
Robots can act as proxies for human operators in environments where a human operator is not present or cannot directly perform a task, such as in dangerous or remote situations. Teleoperation is a common interface for controlling robots that are designed to be human proxies. Unfortunately, teleoperation may fail to preserve the natural fluidity of human motions due to interface limitations such as communication delays, non-immersive sensing, and controller uncertainty. I envision a robot that can learn a set of motions that a teleoperator commonly performs, so that it can autonomously execute routine tasks or recognize a user's motion in real time. Tasks can be either primitive activities or compound plans. During online operation, the robot can recognize a user's teleoperated motions on the fly and offer real-time assistance, for example, by autonomously executing the remainder of the task. I realize this vision by addressing three main problems: (1) learning primitive activities by identifying significant features of the example motions and generalizing the behaviors from user demonstration trajectories; (2) recognizing activities in real time by determining the likelihood that a user is currently executing one of several learned activities; and (3) learning complex plans by generalizing a sequence of activities, through auto-segmentation and incremental learning of previously unknown activities. To solve these problems, I first present an approach to learning activities from human demonstration that (1) provides flexibility and robustness when encoding a user's demonstrated motions by using a novel representation called a probabilistic flow tube, and (2) automatically determines the relevant features of a motion so that they can be preserved during autonomous execution in new situations. I next introduce an approach to real-time motion recognition that (1) uses temporal information to successfully model motions that may be non-Markovian, (2) provides fast real-time recognition of motions in progress by using an incremental temporal alignment approach, and (3) leverages the probabilistic flow tube representation to ensure robustness during recognition against varying environment states. Finally, I develop an approach to learn combinations of activities that (1) automatically determines where activities should be segmented in a sequence and (2) learns previously unknown activities on the fly. I demonstrate the results of autonomously executing motions learned by my approach on two different robotic platforms supporting user-teleoperated manipulation tasks in a variety of environments. I also present the results of real-time recognition in different scenarios, including a robotic hardware platform. Systematic testing in a two-dimensional environment shows up to a 27% improvement in activity recognition rates over prior art, while maintaining average computing times for incremental recognition of less than half of human reaction time.
PhD thesis
Published: 23 Aug 2012
https://hdl.handle.net/1721.1/113366
Risk-minimizing program execution in robotic domains
Effinger, Robert
In this thesis, we argue that autonomous robots operating in hostile and uncertain environments can improve robustness by computing and reasoning explicitly about risk. Autonomous robots with a keen sensitivity to risk can be trusted with critical missions, such as exploring deep space and assisting on the battlefield. We introduce a novel, risk-minimizing approach to program execution that utilizes program flexibility and estimation of risk in order to make runtime decisions that minimize the probability of program failure. Our risk-minimizing executive, called Murphy, utilizes two forms of program flexibility, 1) flexible scheduling of activity timing, and 2) redundant choice between subprocedures, in order to minimize two forms of program risk, 1) exceptions arising from activity failures, and 2) exceptions arising from timing constraint violations in a program. Murphy takes two inputs, a program written in a nondeterministic variant of the Reactive Model-based Programming Language (RMPL) and a set of stochastic activity failure models, one for each activity in a program, and computes two outputs, a risk-minimizing decision policy and value function. The decision policy informs Murphy which decisions to make at runtime in order to minimize risk, while the value function quantifies risk. In order to execute with low latency, Murphy computes the decision policy and value function offline, as a compilation step prior to program execution. In this thesis, we develop three approaches to RMPL program execution. First, we develop an approach that is guaranteed to minimize risk. For this approach, we reason probabilistically about risk by framing program execution as a Markov Decision Process (MDP). Next, we develop an approach that avoids risk altogether. For this approach, we frame program execution as a novel form of constraint-based temporal reasoning. Finally, we develop an execution approach that trades optimality in risk avoidance for tractability. For this approach, we leverage prior work in hierarchical decomposition of MDPs in order to mitigate complexity. We benchmark the tractability of each approach on a set of representative RMPL programs, and we demonstrate the applicability of the approach on a humanoid robot simulator.
PhD thesis
Published: 2 Feb 2012
https://hdl.handle.net/1721.1/113365
Optimal Temporal Planning at Reactive Time Scales via Dynamic Backtracking Branch and Bound
Effinger, Robert
Autonomous robots are being considered for increasingly capable roles in our society, such as urban search and rescue, automation for assisted living, and lunar habitat construction. To fulfill these roles, teams of autonomous robots will need to cooperate together to accomplish complex mission objectives in uncertain and dynamic environments. In these environments, autonomous robots face a host of new challenges, such as responding robustly to timing uncertainties and perturbations, task and coordination failures, and equipment malfunctions. In order to address these challenges, this thesis advocates a novel planning approach, called temporally-flexible contingent planning. A temporally-flexible contingent plan is a compact encoding of methods for achieving the mission objectives which incorporates robustness through flexible task durations, redundant methods, constraints on when methods are applicable, and preferences between methods. This approach enables robots to adapt to unexpected changes on-the-fly by selecting alternative methods at runtime in order to satisfy as best possible the mission objectives. The drawback to this approach, however, is the computational overhead involved in selecting alternative methods at runtime in response to changes. If a robot takes too long to select a new plan, it could fail to achieve its near-term mission objectives and potentially incur damage. To alleviate this problem, and extend the range of applicability of temporally-flexible contingent planning to more demanding real-time systems, this thesis proposes a temporally-flexible contingent plan executive that selects new methods quickly and optimally in response to changes in a robot's health and environment. We enable fast and optimal method selection through two complementary approaches. First, we frame optimal method selection as a constraint satisfaction problem (CSP) variant, called an Optimal Conditional CSP (OCCSP). Second, we extend fast CSP search algorithms, such as Dynamic Backtracking and Branch-and-Bound Search, to solve OCCSPs. Experiments on an autonomous rover test-bed and on randomly generated plans show that these contributions significantly improve the speed at which robots perform optimal method selection in response to changes in their health status and environment.
SM thesis
Published: 25 Aug 2006
https://hdl.handle.net/1721.1/113364
Fast, Approximate State Estimation of Concurrent Probabilistic Hybrid Automata
Timmons, Eric
It is an undeniable fact that autonomous systems are simultaneously becoming more commonplace, more complex, and deployed in more inhospitable environments. Examples include smart homes, smart cars, Mars rovers, unmanned aerial vehicles, and autonomous underwater vehicles. A common theme that all of these autonomous systems share is that in order to appropriately control them and prevent mission failure, they must be able to quickly estimate their internal state and the state of the world. A natural representation of many real world systems is to describe them in terms of a mixture of continuous and discrete variables. Unfortunately, hybrid estimation is typically intractable due to the large space of possible assignments to the discrete variables. In this thesis, we investigate how to incorporate conflict-directed techniques from the consistency-based, model-based diagnosis community into a hybrid framework that is no longer purely consistency based. We introduce a novel search algorithm, A∗ with Bounding Conflicts, that uses conflicts not only to record infeasibilities, but also to learn where in the search space the heuristic function provided to the A∗ search is weak (possibly due to moderate to heavy sensor or process noise). Additionally, we describe a hybrid state estimation algorithm that uses this new search to perform estimation on hybrid discrete/continuous systems.
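The following Python sketch conveys the bounding-conflict idea generically (it is not the thesis's estimator): A∗ searches over partial assignments, and each discovered conflict pairs a partial assignment with a tighter lower bound that re-prioritizes, rather than prunes, matching nodes.

    import heapq

    def matches(partial, conflict):
        return all(partial.get(k) == v for k, v in conflict.items())

    def astar_bc(domains, cost, heuristic, conflicts):
        """domains: per-variable value lists; cost/heuristic on partials;
        conflicts: list of (partial_assignment, lower_bound) pairs."""
        def f(partial, g):
            h = heuristic(partial)
            for pa, bound in conflicts:      # tighten h using matching conflicts
                if matches(partial, pa):
                    h = max(h, bound - g)
            return g + h

        frontier = [(f({}, 0.0), 0.0, [])]   # (f, g, assignment as value list)
        while frontier:
            _, g, asg = heapq.heappop(frontier)
            if len(asg) == len(domains):
                return dict(enumerate(asg)), g
            var = len(asg)
            for val in domains[var]:
                child = asg + [val]
                g2 = g + cost(var, val)
                heapq.heappush(frontier, (f(dict(enumerate(child)), g2), g2, child))
        return None

    # Toy use: a conflict says any completion with x0 = 0 costs at least 10,
    # steering the search toward x0 = 1 without pruning x0 = 0 outright.
    result = astar_bc(
        domains=[[0, 1], [0, 1]],
        cost=lambda var, val: 1.0 if val else 0.5,
        heuristic=lambda partial: 0.0,
        conflicts=[({0: 0}, 10.0)],
    )
    print(result)   # ({0: 1, 1: 0}, 1.5)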
SM thesis
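For flavor only (this is not the thesis's A∗ with Bounding Conflicts), here is a compact Python sketch of conflict-directed best-first enumeration over discrete mode assignments, in which each discovered conflict, a partial assignment known to be inconsistent, prunes every superset without re-testing it:

    import heapq, itertools

    def conflict_directed_search(variables, domains, cost, test):
        """cost[v][val]: e.g. -log P(val); test(assignment) returns None if
        consistent, else the conflicting subset (a dict) of the assignment."""
        conflicts = []
        def blocked(assign):
            return any(all(assign.get(v) == val for v, val in c) for c in conflicts)
        tie = itertools.count()                      # heap tie-breaker
        frontier = [(0.0, next(tie), {})]            # cost-ordered partial assignments
        while frontier:
            c, _, assign = heapq.heappop(frontier)
            if blocked(assign):
                continue                             # pruned by a learned conflict
            conflict = test(assign)
            if conflict is not None:
                conflicts.append(tuple(conflict.items()))
                continue
            pending = [v for v in variables if v not in assign]
            if not pending:
                return assign                        # lowest-cost consistent estimate
            v = pending[0]
            for val in domains[v]:
                child = dict(assign); child[v] = val
                heapq.heappush(frontier, (c + cost[v][val], next(tie), child))
        return None

    # Toy diagnosis: observations rule out "m1 = broken".
    doms = {"m1": ["ok", "broken"], "m2": ["ok", "broken"]}
    costs = {"m1": {"ok": 0.1, "broken": 2.3}, "m2": {"ok": 0.1, "broken": 2.3}}
    test = lambda a: {"m1": "broken"} if a.get("m1") == "broken" else None
    print(conflict_directed_search(["m1", "m2"], doms, costs, test))  # {'m1': 'ok', 'm2': 'ok'}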
2013年12月11日 00:00:00 GMThttps://hdl.handle.net/1721.1/1133642013年12月11日T00:00:00ZDecision Uncertainty Minimization and Autonomous Information Gathering
https://hdl.handle.net/1721.1/113363
Decision Uncertainty Minimization and Autonomous Information Gathering
Bush, Lawrence A. M.
Over the past several decades, technologies for remote sensing and exploration have become increasingly powerful but continue to face limitations in the areas of information gathering and analysis. These limitations affect technologies that use autonomous agents, which are devices that can make routine decisions independent of operator instructions. Bandwidth and other communications limitations require that autonomous agents differentiate between relevant and irrelevant information in a computationally efficient manner. This thesis presents a novel approach to this problem by framing it as an adaptive sensing problem. Adaptive sensing allows agents to modify their information collection strategies in response to the information gathered in real time. We developed and tested optimization algorithms that apply information guides to Monte Carlo planners. Information guides provide a mechanism by which the algorithms may blend online (real-time) and offline (previously simulated) planning in order to incorporate uncertainty into the decision-making process. This greatly reduces computational operations as well as decisional and communications overhead. We begin by introducing a 3-level hierarchy that visualizes adaptive sensing at synoptic (global), mesoscale (intermediate), and microscale (close-up) levels (a spatial hierarchy). We then introduce new algorithms for decision uncertainty minimization (DUM) and representational uncertainty minimization (RUM). Finally, we demonstrate the utility of this approach on real-world sensing problems, including bathymetric mapping and disaster relief. We also examine its potential in space exploration tasks by describing its use in a hypothetical aerial exploration of Mars. Our ultimate goal is to facilitate future large-scale missions to extraterrestrial objects for the purposes of scientific advancement and human exploration.
PhD thesis
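As a hedged toy of the adaptive-sensing idea (the thesis's Monte Carlo planners and information guides are far more sophisticated), the sketch below selects the measurement that minimizes expected posterior entropy over a discrete set of hypotheses; the sensor model and numbers are invented:

    import math

    def entropy(p):
        return -sum(pi * math.log(pi) for pi in p if pi > 0)

    def posterior(prior, like):
        joint = [p * l for p, l in zip(prior, like)]
        z = sum(joint)
        return [j / z for j in joint]

    def best_measurement(prior, sensor):
        """sensor[a][h][o] = P(observe o | hypothesis h, action a)."""
        def expected_entropy(a):
            total = 0.0
            for o in range(len(sensor[a][0])):
                like = [sensor[a][h][o] for h in range(len(prior))]
                p_o = sum(p * l for p, l in zip(prior, like))
                if p_o > 0:
                    total += p_o * entropy(posterior(prior, like))
            return total
        return min(range(len(sensor)), key=expected_entropy)

    prior = [0.5, 0.5]
    sensor = [[[0.9, 0.1], [0.1, 0.9]],   # action 0 strongly discriminates
              [[0.5, 0.5], [0.5, 0.5]]]   # action 1 is uninformative
    print(best_measurement(prior, sensor))  # -> 0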
2013年8月22日 00:00:00 GMThttps://hdl.handle.net/1721.1/1133632013年08月22日T00:00:00ZDelay Controllability: Multi-Agent Coordination under Communication Delay
https://hdl.handle.net/1721.1/113340
Delay Controllability: Multi-Agent Coordination under Communication Delay
Bhargava, Nikhil; Muise, Christian; Vaquero, Tiago; Williams, Brian
Simple Temporal Networks with Uncertainty provide a useful framework for modeling temporal constraints and, importantly, for modeling actions with uncertain durations. To determine whether we can construct a schedule for a given network, we typically consider one of two types of controllability: dynamic or strong. These controllability checks have strict conditions on how uncertainty is resolved; uncertain outcomes are either recognized immediately or not at all. In this paper, we introduce delay controllability, a novel generalization of both strong and dynamic controllability that additionally exposes a large range of controllability classes in between. To do so, we use a delay function to parameterize our controllability checking. This delay function represents the difference between when an event happens and the time that it is observed. We also provide a single unified algorithm for checking delay controllability that runs in O(n^3) time, matching the best known runtime for dynamic controllability, a fact that motivates our decision to generalize dynamic and strong controllability. We conclude by providing an empirical evaluation of delay controllability, demonstrating its superior accuracy and practical efficiency as compared to other existing approximations.
New version posted April 19, 2019 with slight tweaks to the algorithm and added clarity based on reviewer feedback.
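A small illustration of the delay-function parameterization described above (interfaces invented for this sketch, not the paper's code): dynamic and strong controllability correspond to the two extreme delay functions, with realistic communication delays in between:

    # A delay function maps each uncontrollable event to how long after it
    # occurs its outcome is observed by the agent.
    LINK_LATENCY = {"rover_done": 7.5}        # seconds; made-up example value

    def dynamic_delay(event):                 # outcomes observed immediately
        return 0.0

    def strong_delay(event):                  # outcomes never observed online
        return float("inf")

    def comms_delay(event):                   # an in-between delay function
        return LINK_LATENCY.get(event, 0.0)

Checking controllability against comms_delay-style functions is what the paper's single O(n^3) algorithm handles uniformly across the whole spectrum.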
2018年1月29日 00:00:00 GMThttps://hdl.handle.net/1721.1/1133402018年01月29日T00:00:00ZPrivacy and Security Risks for National Health Records Systems
https://hdl.handle.net/1721.1/113292
Privacy and Security Risks for National Health Records Systems
Alawaji, Ahmed; Sollins, Karen
A review of national health records (NEHR) systems shows that privacy and security risks have a profound impact on the success of such projects. Countries have different approaches when dealing with privacy and security considerations. The aim of this study was to explore how governments can design secure national health records systems. To do that systematically, we developed a framework for analyzing NEHR systems. We then applied the framework to investigate the privacy and security risks in these systems. The studied systems demonstrate that getting privacy and security right has a considerable impact on the success of NEHR projects. Also, our study reveals that the healthcare system structure has a substantial impact on the adoption and usage rates of the system. The studied cases uncover many opportunities for improving privacy and security measures in future projects. Applying the framework to the three cases demonstrates its utility.
SM thesis
2018年1月24日 00:00:00 GMThttps://hdl.handle.net/1721.1/1132922018年01月24日T00:00:00ZGenerating Component-based Supervised Learning Programs From Crowdsourced Examples
https://hdl.handle.net/1721.1/112949
Generating Component-based Supervised Learning Programs From Crowdsourced Examples
Cambronero, Jose; Rinard, Martin
We present CrowdLearn, a new system that processes an existing corpus of crowdsourced machine learning programs to learn how to generate effective pipelines for solving supervised machine learning problems. CrowdLearn uses a probabilistic model of program likelihood, conditioned on the current sequence of pipeline components and on the characteristics of the input data to the next component in the pipeline, to predict candidate pipelines. Our results highlight the effectiveness of this technique in leveraging existing crowdsourced programs to generate pipelines that work well on a range of supervised learning problems.
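As a much-simplified sketch of the pipeline-likelihood idea (CrowdLearn also conditions on characteristics of the input data, which this toy omits, and the corpus below is invented), one can estimate next-component probabilities from crowdsourced pipelines and extend a pipeline greedily:

    from collections import Counter, defaultdict

    def train(corpus):
        model = defaultdict(Counter)          # previous component -> next counts
        for pipeline in corpus:
            seq = ["<start>"] + list(pipeline)
            for prev, nxt in zip(seq, seq[1:]):
                model[prev][nxt] += 1
        return model

    def most_likely_pipeline(model, length):
        comp, out = "<start>", []
        for _ in range(length):
            comp = model[comp].most_common(1)[0][0]   # greedy argmax
            out.append(comp)
        return out

    corpus = [["impute", "scale", "logreg"], ["impute", "scale", "svm"],
              ["scale", "logreg"]]
    print(most_likely_pipeline(train(corpus), 3))  # ['impute', 'scale', 'logreg']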
2017年12月21日 00:00:00 GMThttps://hdl.handle.net/1721.1/1129492017年12月21日T00:00:00ZTypesafety for Explicitly-Coded Probabilistic Inference Procedures
https://hdl.handle.net/1721.1/112172
Typesafety for Explicitly-Coded Probabilistic Inference Procedures
Atkinson, Eric; Carbin, Michael
Researchers have recently proposed several systems that ease the process of developing Bayesian probabilistic inference algorithms. These include systems for automatic inference algorithm synthesis as well as stronger abstractions for manual algorithm development. However, existing systems whose performance relies on the developer manually constructing a part of the inference algorithm have limited support for reasoning about the correctness of the resulting algorithm. In this paper, we present Shuffle, a programming language for developing manual inference algorithms that enforces 1) the basic rules of probability theory and 2) statistical dependencies of the algorithm's corresponding probabilistic model. We have used Shuffle to develop inference algorithms for several standard probabilistic models. Our results demonstrate that Shuffle enables a developer to deliver performant implementations of these algorithms with the added benefit of Shuffle's correctness guarantees.
2017年11月09日 00:00:00 GMThttps://hdl.handle.net/1721.1/1121722017年11月09日T00:00:00ZThe Interval Programming Model Solution Algorithm Experimentation Tools and Results
https://hdl.handle.net/1721.1/111117
The Interval Programming Model Solution Algorithm Experimentation Tools and Results
Benjamin, Michael R.
Interval programming (IvP) is a model for representing multi-objective optimization problems, along with a set of solution algorithms. This paper describes a set of IvP solution experiments run over randomly generated problem instances, using five different versions of the Recursive Interval Programming ALgorithm (RIPAL). The final version is the algorithm used most extensively in practice, with the first four provided mostly for comparison as the final version is built up in complexity. The full details of the algorithms are outside the scope of this paper; the focus here is the experimental results and the software tools and techniques used in generating the problem instances. Additional tools are described for facilitating the experiments, including visualization tools, and tools for generating the plots and tables shown in this document. All software tools are available under an open source license, and all problem instances reported here are also available online. This document is meant to supplement other discussions of the IvP model, algorithms, and IvP applications, providing a level of detail that would not be possible in other papers due to length restrictions.
2017年9月01日 00:00:00 GMThttps://hdl.handle.net/1721.1/1111172017年09月01日T00:00:00ZInference and Regeneration of Programs that Manipulate Relational Databases
https://hdl.handle.net/1721.1/111067
Inference and Regeneration of Programs that Manipulate Relational Databases
Shen, Jiasi; Rinard, Martin
We present a new technique that infers models of programs that manipulate relational databases. This technique generates test databases and input commands, runs the program, then observes the resulting outputs and updated databases to infer the model. Because the technique works only with the externally observable inputs, outputs, and databases, it can infer the behavior of programs written in arbitrary languages using arbitrary coding styles and patterns. We also present a technique for automatically regenerating an implementation of the program based on the inferred model. The regenerator can produce a translated implementation in a different language and systematically include relevant security and error checks. We present results that illustrate the use of the technique to eliminate SQL injection vulnerabilities and the translation of applications from Java and Ruby on Rails to Python.
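A hedged miniature of the infer-by-observation loop (not the paper's system; the black-box query and schema are invented): feed the program small synthetic databases and watch which attribute values change its externally observable output:

    import sqlite3

    def blackbox(conn):                      # stand-in for the opaque program
        return conn.execute("SELECT name FROM users WHERE age >= 18").fetchall()

    def probe(rows):
        conn = sqlite3.connect(":memory:")   # fresh test database per probe
        conn.execute("CREATE TABLE users (name TEXT, age INTEGER)")
        conn.executemany("INSERT INTO users VALUES (?, ?)", rows)
        return blackbox(conn)

    # Vary one attribute at a time and observe inclusion/exclusion.
    for age in (10, 18, 30):
        print(age, probe([("alice", age)]))
    # Rows appear exactly when age >= 18, suggesting an inferred model of the
    # form "SELECT ... WHERE age >= t" with a threshold t between 10 and 18.

Once such a model is recovered, regeneration can emit an equivalent program in another language with parameterized queries and error checks inserted systematically.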
2017年8月29日 00:00:00 GMThttps://hdl.handle.net/1721.1/1110672017年08月29日T00:00:00ZAn Efficient Fill Estimation Algorithm for Sparse Matrices and Tensors in Blocked Formats
https://hdl.handle.net/1721.1/109792
An Efficient Fill Estimation Algorithm for Sparse Matrices and Tensors in Blocked Formats
Ahrens, Willow; Schiefer, Nicholas; Xu, Helen
Tensors, linear-algebraic extensions of matrices in arbitrary dimensions, have numerous applications in computer science and computational science. Many tensors are sparse, containing more than 90% zero entries. Efficient algorithms can leverage sparsity to do less work, but the irregular locations of the nonzero entries pose challenges to performance engineers. Many tensor operations such as tensor-vector multiplications can be sped up substantially by breaking the tensor into equally sized blocks (only storing blocks which contain nonzeros) and performing operations in each block using carefully tuned code. However, selecting the best block size is computationally challenging. Previously, Vuduc et al. defined the fill of a sparse tensor to be the number of stored entries in the blocked format divided by the number of nonzero entries, and showed that the fill can be used as an effective heuristic to choose a good block size. However, they gave no accuracy bounds for their method for estimating the fill, and it is vulnerable to adversarial examples. In this paper, we present a sampling-based method for finding a (1 + epsilon)-approximation to the fill of an order N tensor for all block sizes less than B, with probability at least 1 - delta, using O(B^(2N) log(B^N / delta) / epsilon^2) samples for each block size. We introduce an efficient routine to sample for all B^N block sizes at once in O(N B^N) time. We extend our concentration bounds to a more efficient bound based on sampling without replacement, using the recent Hoeffding-Serfling inequality. We then implement our algorithm and compare our scheme to that of Vuduc, as implemented in the Optimized Sparse Kernel Interface (OSKI) library. We find that our algorithm provides faster estimates of the fill at all accuracy levels, providing evidence that this is both a theoretical and practical improvement. Our code is available under the BSD 3-clause license at https://github.com/peterahrens/FillEstimation.
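As an illustration of the sampling idea (the 2-D matrix case, with constants and loop structure simplified relative to the paper's order-N algorithm), here is an unbiased estimator of the fill for a single block size:

    import random

    def estimate_fill(nonzeros, b1, b2, samples=1000):
        """fill = stored entries in (b1 x b2)-blocked format / nnz. A sampled
        nonzero's block holds b1*b2 stored entries shared by its k nonzeros,
        so each sample contributes b1*b2/k; the mean is an unbiased estimate."""
        nnz = set(nonzeros)
        total = 0.0
        for (i, j) in random.choices(nonzeros, k=samples):
            bi, bj = i - i % b1, j - j % b2          # block origin
            k = sum((bi + di, bj + dj) in nnz
                    for di in range(b1) for dj in range(b2))
            total += (b1 * b2) / k
        return total / samples

    random.seed(0)
    nz = sorted({(i, i) for i in range(64)} | {(i, (7 * i) % 64) for i in range(64)})
    print(round(estimate_fill(nz, 4, 4), 2))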
2017年6月09日 00:00:00 GMThttps://hdl.handle.net/1721.1/1097922017年06月09日T00:00:00ZMulti-Unit Auction Revenue with Possibilistic Beliefs
https://hdl.handle.net/1721.1/109726
Multi-Unit Auction Revenue with Possibilistic Beliefs
Micali, Silvio; Vlachos, Georgios
The revenue of traditional auction mechanisms is benchmarked solely against the players' own valuations, despite the fact that they may also have valuable beliefs about each other's valuations. Not much is known about generating revenue in auctions of multiple identical copies of the same good. (In particular, the celebrated Vickrey mechanism has no revenue guarantees.) For such auctions, we (1) put forward an attractive revenue benchmark, based on the players' possibilistic beliefs about each other, and (2) construct a mechanism that achieves this benchmark, assuming that the players are two-level rational (where the rationality is in the sense of Aumann).
2017年6月05日 00:00:00 GMThttps://hdl.handle.net/1721.1/1097262017年06月05日T00:00:00ZAutonomous COLREGS Modes and Velocity Functions
https://hdl.handle.net/1721.1/109146
Autonomous COLREGS Modes and Velocity Functions
Benjamin, Michael R.
This paper concerns an implementation of an autonomy system for unmanned surface vessels operating in accordance with the Coast Guard Collision Regulations (COLREGS). The autonomy system is implemented by associating a dedicated ownship behavior module with each contact for collision avoidance. For each behavior, a mode determination is made based on the COLREGS rules, ownship position and trajectory, and the contact position and trajectory. Based on the mode, an appropriate objective function is generated, over the set of possible ownship maneuvers, to bias the vehicle in accordance with the COLREGS. The focus of this paper is solely on (a) the mode determination algorithms, (b) the requisite ownship and contact terms regarding position, trajectory and relative position utilized in the mode determination algorithms, and (c) the form and equations used in constructing the objective functions associated with each mode.
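As a crude, hypothetical discretization of the mode-determination step (the paper's rules use more ownship and contact terms than this, and these thresholds are rough), the sketch below classifies an encounter from relative bearing and heading difference, in degrees:

    def colregs_mode(own_heading, contact_bearing, contact_heading):
        """Illustration only; thresholds are rough and not the paper's."""
        rel = (contact_bearing - own_heading) % 360            # where the contact is
        diff = abs((contact_heading - own_heading + 180) % 360 - 180)
        if diff > 170 and (rel < 10 or rel > 350):
            return "head-on"            # roughly Rule 14: nearly reciprocal courses
        if 112.5 < rel < 247.5:
            return "being-overtaken"    # contact abaft our beam, roughly Rule 13
        return "crossing"               # otherwise treated as Rule 15 geometry

    print(colregs_mode(0, 5, 182))      # head-on
    print(colregs_mode(0, 90, 270))     # crossing

In the paper's architecture, each such mode then selects an objective function over candidate ownship maneuvers rather than a single fixed maneuver.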
2017年5月16日 00:00:00 GMThttps://hdl.handle.net/1721.1/1091462017年05月16日T00:00:00ZAutomatic Inference of Code Transforms and Search Spaces for Automatic Patch Generation Systems
https://hdl.handle.net/1721.1/108619
Automatic Inference of Code Transforms and Search Spaces for Automatic Patch Generation Systems
Long, Fan; Amidon, Peter; Rinard, Martin
We present a new system, Genesis, that processes sets of human patches to automatically infer code transforms and search spaces for automatic patch generation. We present results that characterize the effectiveness of the Genesis inference algorithms and the resulting complete Genesis patch generation system working with real-world patches and errors collected from the top 1,000 GitHub Java software development projects. To the best of our knowledge, Genesis is the first system to automatically infer patch generation transforms or candidate patch search spaces from successful patches.
2016年7月08日 00:00:00 GMThttps://hdl.handle.net/1721.1/1086192016年07月08日T00:00:00ZPlanning Robust Strategies for Constructing Multi-object Arrangements
https://hdl.handle.net/1721.1/108510
Planning Robust Strategies for Constructing Multi-object Arrangements
Anders, Ariel; Kaelbling, Leslie; Lozano-Perez, Tomas
A crucial challenge in robotics is achieving reliable results in spite of sensing and control uncertainty. A prominent strategy for dealing with uncertainty is to construct a feedback policy, where actions are chosen as a function of the current state estimate. However, constructing such policies is computationally very difficult. An alternative strategy is conformant planning, which finds open-loop action sequences that achieve the goal for all input states and action outcomes. In this work, we investigate the conformant planning approach to robot manipulation. In particular, we tackle the problem of pushing multiple objects simultaneously to achieve a specified arrangement. Conformant planning is a belief-state planning problem. A belief state is the set of all possible states of the world, and the goal is to find a sequence of actions that will bring an initial belief state to a goal belief state. To do forward belief-state planning, we created a deterministic belief-state transition model from supervised learning based on physics simulations. A key pitfall in conformant planning is that the complexity of the belief state tends to increase with each operation, making it increasingly harder to compute the effect of actions. This work explores the idea that we can construct conformant plans for robot manipulation by only using actions resulting in compact belief states.
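A hedged miniature of conformant planning over belief states (the one-dimensional pushing domain and interfaces are invented): lift deterministic actions to sets of states and breadth-first search for an open-loop sequence that funnels every possible initial state into the goal:

    from collections import deque

    def conformant_plan(initial_belief, actions, goal):
        start = frozenset(initial_belief)
        frontier, seen = deque([(start, [])]), {start}
        while frontier:
            belief, plan = frontier.popleft()
            if belief <= goal:
                return plan                               # works for every initial state
            for name, act in actions.items():
                nxt = frozenset(act(s) for s in belief)   # lift action to beliefs
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append((nxt, plan + [name]))
        return None

    # Object at an unknown position 0..4 on a line with a wall at 0; pushing
    # left against the wall shrinks the belief state to a single point.
    actions = {"left": lambda x: max(0, x - 1), "right": lambda x: min(4, x + 1)}
    print(conformant_plan(range(5), actions, goal={0}))  # ['left', 'left', 'left', 'left']

Note how the wall makes "left" a funneling action: its lifted image shrinks the belief state, which is exactly the compactness property the abstract argues good manipulation plans should exploit.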
2017年1月30日 00:00:00 GMThttps://hdl.handle.net/1721.1/1085102017年01月30日T00:00:00ZInference and Regeneration of Programs that Store and Retrieve Data
https://hdl.handle.net/1721.1/108383
Inference and Regeneration of Programs that Store and Retrieve Data
Rinard, Martin; Shen, Jiasi
As modern computation platforms become increasingly complex, their programming interfaces are increasingly difficult to use. This complexity is especially inappropriate given the relatively simple core functionality that many of the computations implement. We present a new approach for obtaining software that executes on modern computing platforms with complex programming interfaces. Our approach starts with a simple seed program, written in the language of the developer's choice, that implements the desired core functionality. It then systematically generates inputs and observes the resulting outputs to learn the core functionality. It finally automatically regenerates new code that implements the learned core functionality on the target computing platform. This regenerated code contains both (a) boilerplate code for the complex programming interfaces that the target computing platform presents and (b) systematic error and vulnerability checking code that makes the new implementations robust and secure. By providing a productive new mechanism for capturing and encapsulating knowledge about how to use modern complex interfaces, this new approach promises to greatly reduce the developer effort required to obtain secure, robust software that executes on modern computing platforms.
2017年4月24日 00:00:00 GMThttps://hdl.handle.net/1721.1/1083832017年04月24日T00:00:00ZOn the Non-Existence of Blockwise 2-Local PRGs with Applications to Indistinguishability Obfuscation
https://hdl.handle.net/1721.1/107928
On the Non-Existence of Blockwise 2-Local PRGs with Applications to Indistinguishability Obfuscation
Lombardi, Alex; Vaikuntanathan, Vinod
Lin and Tessaro (Eprint 2017/250) recently proposed indistinguishability obfuscation and functional encryption candidates and proved their security based on a standard assumption on bilinear maps and a non-standard assumption on ``Goldreich-like'' pseudorandom generators (PRG). In a nutshell, they require the existence of pseudo-random generators $G:\Sigma^n \to \{0,1\}^m$ for some $\mathsf{poly}(n)$-size alphabet $\Sigma$ where each output bit depends on at most two input alphabet symbols, and which achieve sufficiently large stretch. We show a polynomial-time attack against such generators. Our attack uses tools from the literature on two-source extractors (Chor and Goldreich, SICOMP 1988) and efficient refutation of 2-CSPs over large alphabets (Allen, O'Donnell and Witmer, FOCS 2015). Finally, we propose new ways to instantiate the Lin-Tessaro construction that do not immediately fall to our attacks. While we cannot say with any confidence that these modifications are secure, they certainly deserve further cryptanalysis.
2017年4月06日 00:00:00 GMThttps://hdl.handle.net/1721.1/1079282017年04月06日T00:00:00ZOptimal and Player-Replaceable Consensus with an Honest Majority
https://hdl.handle.net/1721.1/107927
Optimal and Player-Replaceable Consensus with an Honest Majority
Micali, Silvio; Vaikuntanathan, Vinod
We construct a Byzantine Agreement protocol that tolerates t < n/2 corruptions, is very efficient in terms of the number of rounds and the number of bits of communication, and satisfies a strong notion of robustness called player replaceability (defined in [Mic16]). We provide an analysis of our protocol when executed on real-world networks such as the ones employed in the bitcoin protocol.
2017年3月31日 00:00:00 GMThttps://hdl.handle.net/1721.1/1079272017年03月31日T00:00:00ZThe Tensor Algebra Compiler
https://hdl.handle.net/1721.1/107013
The Tensor Algebra Compiler
Kjolstad, Fredrik; Kamil, Shoaib; Chou, Stephen; Lugato, David; Amarasinghe, Saman
Tensor and linear algebra is pervasive in data analytics and the physical sciences. Often the tensors, matrices or even vectors are sparse. Computing expressions involving a mix of sparse and dense tensors, matrices and vectors requires writing kernels for every operation and combination of formats of interest. The number of possibilities is infinite, which makes it impossible to write library code for all. This problem cries out for a compiler approach. This paper presents a new technique that compiles compound tensor algebra expressions combined with descriptions of tensor formats into efficient loops. The technique is evaluated in a prototype compiler called taco, demonstrating competitive performance to best-in-class hand-written codes for tensor and matrix operations.
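For flavor, here is hand-written Python standing in for the kind of fused loop nest such a compiler emits for y(i) = A(i,j) * x(j) with A stored in compressed sparse row (CSR) format (taco itself generates C, and handles many more formats and compound expressions):

    def spmv_csr(n_rows, pos, crd, vals, x):
        """pos: row pointers; crd: column indices; vals: nonzero values."""
        y = [0.0] * n_rows
        for i in range(n_rows):                  # dense loop over rows
            for p in range(pos[i], pos[i + 1]):  # sparse loop over row i's nonzeros
                y[i] += vals[p] * x[crd[p]]
        return y

    # 2x3 matrix [[1, 0, 2], [0, 3, 0]] times x = [1, 1, 1].
    print(spmv_csr(2, [0, 2, 3], [0, 2, 1], [1.0, 2.0, 3.0], [1, 1, 1]))  # [3.0, 3.0]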
2017年2月17日 00:00:00 GMThttps://hdl.handle.net/1721.1/1070132017年02月17日T00:00:00ZCollaborative Diagnosis of Over-Subscribed Temporal Plans
https://hdl.handle.net/1721.1/106886
Collaborative Diagnosis of Over-Subscribed Temporal Plans
Yu, Peng
Over-subscription, that is, being assigned too many tasks or requirements that are too demanding, is commonly encountered in temporal planning problems. As human beings, we often want to do more than we can, ask for things that may not be available, while underestimating how long it takes to perform each task. It is often difficult for us to detect the causes of failure in such situations and then find resolutions that are effective. We can greatly benefit from tools that assist us by looking out for these plan failures, by identifying their root causes, and by proposing preferred resolutions to these failures that lead to feasible plans. In recent literature, several approaches have been developed to resolve such over-subscribed problems, which are often framed as over-constrained scheduling, configuration design or optimal planning problems. Most of them take an all-or-nothing approach, in which over-subscription is resolved through suspending constraints or dropping goals. While helpful, in real-world scenarios, we often want to preserve our plan goals as much as possible. As human beings, we know that slightly weakening the requirements of a travel plan, or replacing one of its destinations with an alternative one, is often sufficient to resolve an over-subscription problem, whether the requirement being weakened is the duration of a deep-sea survey being planned for, or the restaurant cuisine for a dinner date. The goal of this thesis is to develop domain-independent relaxation algorithms that perform this type of slight weakening of constraints, which we will formalize as continuous relaxation, and to embody them in a computational aid, Uhura, that performs tasks akin to those of an experienced travel agent or ocean scientist. In over-subscribed situations, Uhura helps us diagnose the causes of failure, suggests alternative plans, and collaborates with us in order to resolve conflicting requirements in the most preferred way. Most importantly, the algorithms underlying Uhura support the weakening, instead of suspending, of constraints and variable domains in a temporally flexible plan. The contribution of this thesis is twofold. First, we developed an algorithmic framework, called Best-first Conflict-Directed Relaxation (BCDR), for performing plan relaxation. Second, we use the BCDR framework to perform relaxation for several different families of plan representations involving different types of constraints. These include temporal constraints, chance constraints and variable domain constraints, and we incorporate several specialized conflict detection and resolution algorithms in support of the continuous weakening of them. The key idea behind BCDR's approach to continuous relaxation is to generalize the concepts of discrete conflicts and relaxations, first introduced by the model-based diagnosis community, to hybrid conflicts and relaxations, which denote minimal inconsistencies and minimal relaxations to both discrete and continuous relaxable constraints.
PhD thesis
2016年10月14日 00:00:00 GMThttps://hdl.handle.net/1721.1/1068862016年10月14日T00:00:00ZSE-Sync: A Certifiably Correct Algorithm for Synchronization over the Special Euclidean Group
https://hdl.handle.net/1721.1/106885
SE-Sync: A Certifiably Correct Algorithm for Synchronization over the Special Euclidean Group
Rosen, David M.; Carlone, Luca; Bandeira, Afonso S.; Leonard, John J.
Many important geometric estimation problems naturally take the form of synchronization over the special Euclidean group: estimate the values of a set of unknown poses given noisy measurements of a subset of their pairwise relative transforms. Examples of this class include the foundational problems of pose-graph simultaneous localization and mapping (SLAM) (in robotics), camera motion estimation (in computer vision), and sensor network localization (in distributed sensing), among others. This inference problem is typically formulated as a nonconvex maximum-likelihood estimation that is computationally hard to solve in general. Nevertheless, in this paper we present an algorithm that is able to efficiently recover certifiably globally optimal solutions of the special Euclidean synchronization problem in a non-adversarial noise regime. The crux of our approach is the development of a semidefinite relaxation of the maximum-likelihood estimation whose minimizer provides an exact MLE so long as the magnitude of the noise corrupting the available measurements falls below a certain critical threshold; furthermore, whenever exactness obtains, it is possible to verify this fact a posteriori, thereby certifying the optimality of the recovered estimate. We develop a specialized optimization scheme for solving large-scale instances of this semidefinite relaxation by exploiting its low-rank, geometric, and graph-theoretic structure to reduce it to an equivalent optimization problem defined on a low-dimensional Riemannian manifold, and then design a Riemannian truncated-Newton trust-region method to solve this reduction efficiently. Finally, we combine this fast optimization approach with a simple rounding procedure to produce our algorithm, SE-Sync. Experimental evaluation on a variety of simulated and real-world pose-graph SLAM datasets shows that SE-Sync is capable of recovering certifiably globally optimal solutions when the available measurements are corrupted by noise up to an order of magnitude greater than that typically encountered in robotics and computer vision applications, and does so more than an order of magnitude faster than the Gauss-Newton-based approach that forms the basis of current state-of-the-art techniques.
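Schematically (weights and conventions may differ from the paper's exact formulation), the underlying maximum-likelihood estimation is the nonconvex program

    \min_{\{R_i\} \subset \mathrm{SO}(d),\; \{t_i\} \subset \mathbb{R}^d}
    \sum_{(i,j)} \kappa_{ij} \, \bigl\| R_j - R_i \tilde{R}_{ij} \bigr\|_F^2
    + \tau_{ij} \, \bigl\| t_j - t_i - R_i \tilde{t}_{ij} \bigr\|_2^2 ,

where (\tilde{R}_{ij}, \tilde{t}_{ij}) are the noisy relative pose measurements and \kappa_{ij}, \tau_{ij} their concentration weights; SE-Sync relaxes the rotation constraints to a semidefinite program and can certify a posteriori when that relaxation is exact.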
2017年2月05日 00:00:00 GMThttps://hdl.handle.net/1721.1/1068852017年02月05日T00:00:00ZPropositional and Activity Monitoring Using Qualitative Spatial Reasoning
https://hdl.handle.net/1721.1/105848
Propositional and Activity Monitoring Using Qualitative Spatial Reasoning
Lane, Spencer Dale
Communication is the key to effective teamwork regardless of whether the team members are humans or machines. Much of the communication that makes human teams so effective is non-verbal; they are able to recognize the actions that the other team members are performing and take their own actions in order to assist. A robotic team member should be able to make the same inferences, observing the state of the environment and inferring what actions are being taken. In this thesis I introduce a novel approach to the combined problem of activity recognition and propositional monitoring. This approach breaks down the problem into smaller sub-tasks. First, the raw sensor input is parsed into simple, easy-to-understand primitive semantic relationships known as qualitative spatial relations (QSRs). These primitives are then combined to estimate the state of the world in the same language used by most planners: Planning Domain Definition Language (PDDL) propositions. Both the primitives and propositions are combined to infer the status of the actions that the human is taking. I describe an algorithm for solving each of these smaller problems and describe the modeling process for a variety of tasks from an abstracted electronic component assembly (ECA) scenario. I implemented this scenario on a robotic testbed and collected data from a human performing the example actions.
SM thesis
2016年12月14日 00:00:00 GMThttps://hdl.handle.net/1721.1/1058482016年12月14日T00:00:00ZSound and Complete Runtime Security Monitor for Application Software
https://hdl.handle.net/1721.1/105847
Sound and Complete Runtime Security Monitor for Application Software
Khan, M. Taimoor; Serpanos, Dimitrios; Shrobe, Howard
We present a run-time security monitor that detects both known and unknown cyber attacks by checking that the run-time behavior of the application is consistent with the expected behavior modeled by an application specification. This is crucial because, even if the implementation is consistent with its specification, the application may still be vulnerable due to flaws in the supporting infrastructure. This run-time security monitor is sound and complete, eliminating false alarms, as well as efficient, so that it does not limit run-time application performance and so that it supports real-time systems. Importantly, this monitor is readily applicable to both legacy and new system platforms. The security monitor takes as input the application specification and the application implementation, which may be expressed in different languages. The security monitor detects attacks by systematically comparing the application execution and specification behaviors at run-time, even though they operate at two different levels of abstraction. We define the denotational semantics of the specification language and prove that the monitor is sound and complete, i.e. if the application is consistent with its specification, the security monitor will produce no false alarms (soundness) and that it will detect any deviation of the application from the behavior sanctioned by the specification language (completeness). Importantly, the application specification language enables the description of known or potential attack plans, enabling not only attack detection but attack characterization as well.
2016年12月15日 00:00:00 GMThttps://hdl.handle.net/1721.1/1058472016年12月15日T00:00:00ZOort: User-Centric Cloud Storage with Global Queries
https://hdl.handle.net/1721.1/105802
Oort: User-Centric Cloud Storage with Global Queries
Chajed, Tej; Gjengset, Jon; Kaashoek, M. Frans; Mickens, James; Morris, Robert; Zeldovich, Nickolai
In principle, the web should provide the perfect stage for user-generated content, allowing users to share their data seamlessly with other users across services and applications. In practice, the web fragments a user's data over many sites, each exposing only limited APIs for sharing. This paper describes Oort, a new cloud storage system that organizes data primarily by user rather than by application or web site. Oort allows users to choose which web software to use with their data and which other users to share it with, while giving applications powerful tools to query that data. Users rent space from providers that cooperate to provide a global, federated, general-purpose storage system. To support large-scale, multi-user applications such as Twitter and e-mail, Oort provides global queries that find and combine data from relevant users across all providers. Oort makes global query execution efficient by recognizing and merging similar queries issued by many users' application instances, largely eliminating the per-user factor in the global complexity of queries. Our evaluation predicts that an Oort implementation could handle traffic similar to that seen by Twitter using a hundred cooperating Oort servers, and that applications with other sharing patterns, like e-mail, can also be executed efficiently.
2016年12月08日 00:00:00 GMThttps://hdl.handle.net/1721.1/1058022016年12月08日T00:00:00ZData and Code for "Automatic Identification of Narrative Diegesis and Point of View"
https://hdl.handle.net/1721.1/105279
Data and Code for "Automatic Identification of Narrative Diegesis and Point of View"
Eisenberg, Joshua D.; Finlayson, Mark A.
This archive contains the code and data for the workshop article "Automatic Identification of Narrative Diegesis and Point of View," published in 2016 in the 2nd Workshop for Computing News Storylines (CNewsStory 2016), co-located with EMNLP 2016 in Austin, TX. The root of the archive contains a README file which explains the archive contents. Furthermore, the archive can be imported directly into the Eclipse IDE as a project encapsulating the executable code required to reproduce the results of the paper; the code compiles with Java 1.8. The archive also contains a copy of the final version of the paper for reference.
2016年11月09日 00:00:00 GMThttps://hdl.handle.net/1721.1/1052792016年11月09日T00:00:00ZReport on the 2015 NSF Workshop on Unified Annotation Tooling
https://hdl.handle.net/1721.1/105270
Report on the 2015 NSF Workshop on Unified Annotation Tooling
Finlayson, Mark Alan
On March 30 & 31, 2015, an international group of twenty-three researchers with expertise in linguistic annotation convened in Sunny Isles Beach, Florida to discuss problems with and potential solutions for the state of linguistic annotation tooling. The participants comprised 14 researchers from the U.S. and 9 from outside the U.S., with 7 countries and 4 continents represented, and hailed from fields and specialties including computational linguistics, artificial intelligence, speech processing, multi-modal data processing, clinical & medical natural language processing, linguistics, documentary linguistics, sign-language linguistics, corpus linguistics, and the digital humanities. The motivating problem of the workshop was the balkanization of annotation tooling, namely, that even though linguistic annotation requires sophisticated tool support to efficiently generate high-quality data, the landscape of tools for the field is fractured, incompatible, inconsistent, and lacks key capabilities. The overall goal of the workshop was to chart the way forward, centering on five key questions: (1) What are the problems with the current tool landscape? (2) What are the possible benefits of solving some or all of these problems? (3) What capabilities are most needed? (4) How should we go about implementing these capabilities? And, (5) How should we ensure longevity and sustainability of the solution? I surveyed the participants before their arrival, which provided significant raw material for ideas, and the workshop discussion itself resulted in the identification of ten specific classes of problems and five sets of most-needed capabilities. Importantly, we identified annotation project managers in computational linguistics as the key recipients and users of any solution, thereby succinctly addressing questions about the scope and audience of potential solutions. We discussed management and sustainability of potential solutions at length. The participants agreed on sixteen recommendations for future work. This technical report contains a detailed discussion of all these topics, a point-by-point review of the discussion in the workshop as it unfolded, detailed information on the participants and their expertise, and the summarized data from the surveys.
2016年11月08日 00:00:00 GMThttps://hdl.handle.net/1721.1/1052702016年11月08日T00:00:00ZAlpenhorn: Bootstrapping Secure Communication without Leaking Metadata
https://hdl.handle.net/1721.1/105093
Alpenhorn: Bootstrapping Secure Communication without Leaking Metadata
Lazar, David; Zeldovich, Nickolai
Alpenhorn is the first system for initiating an encrypted connection between two users that provides strong privacy and forward secrecy guarantees for metadata (i.e., information about which users connected to each other) and that does not require out-of-band communication other than knowing the other user's Alpenhorn username (email address). This resolves a significant shortcoming in all prior works on private messaging, which assume an out-of-band key distribution mechanism. Alpenhorn's design builds on three ideas. First, Alpenhorn provides each user with an address book of friends that the user can call to establish a connection. Second, when a user adds a friend for the first time, Alpenhorn ensures the adversary does not learn the friend's identity, by using identity-based encryption in a novel way to privately determine the friend's public key. Finally, when calling a friend, Alpenhorn ensures forward secrecy of metadata by storing pairwise shared secrets in friends' address books, and evolving them over time, using a new keywheel construction. Alpenhorn relies on a number of servers, but operates in an anytrust model, requiring just one of the servers to be honest. We implemented a prototype of Alpenhorn, and integrated it into the Vuvuzela private messaging system (which did not previously provide privacy or forward secrecy of metadata when initiating conversations). Experimental results show that Alpenhorn can scale to many users, supporting 10 million users on three Alpenhorn servers with an average call latency of 150 seconds and a client bandwidth overhead of 3.7 KB/sec.
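A hedged sketch of a keywheel-style ratchet (details invented; this is not Alpenhorn's exact construction): both friends store a pairwise secret and hash it forward once per epoch, so compromising the current secret reveals nothing about earlier epochs' keys:

    import hashlib

    def advance(secret):
        # One-way step: moving forward is easy, rewinding is infeasible.
        return hashlib.sha256(b"keywheel-ratchet|" + secret).digest()

    def key_for_epoch(secret, held_epoch, target_epoch):
        assert target_epoch >= held_epoch, "cannot rewind a one-way ratchet"
        for _ in range(target_epoch - held_epoch):
            secret = advance(secret)
        return secret

    k0 = b"\x00" * 32                 # established when the friend was added
    k5 = key_for_epoch(k0, 0, 5)      # either side derives the same epoch key
    assert k5 == key_for_epoch(key_for_epoch(k0, 0, 2), 2, 5)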
2016年10月05日 00:00:00 GMThttps://hdl.handle.net/1721.1/1050932016年10月05日T00:00:00ZExamining Key Mobility Resources through Denial of Service Attacks on proposed Global Name Resolution Services
https://hdl.handle.net/1721.1/104385
Examining Key Mobility Resources through Denial of Service Attacks on proposed Global Name Resolution Services
Rock, Colleen T.
The problem we address in this thesis is to uncover the design elements in a network architecture that may open it up to denial of service (DoS) attacks and to expose the tradeoffs in mitigating those DoS opportunities. We take as our candidate network architecture the Future Internet Architecture project MobilityFirst. MobilityFirst's overarching goal, driven by increasingly available wireless communication, is the support of mobility in an Internet architecture. At its core, MobilityFirst separates identification from location, as distinct from the current Internet architecture, and postulates the existence of globally unique, flat identifiers. In order to support mobility in this context, it also postulates a global name resolution service (GNRS). In this thesis we examine three alternative designs for the GNRS and the opportunities they expose for DoS attacks. We consider each one in depth analytically. As an example, we then study one particular attack in depth and are forced to conclude that approaches to mitigating this attack would have a significant negative impact on the support of mobility, thus exposing the dilemma in such system design tradeoffs.
MEng thesis
2016年9月26日 00:00:00 GMThttps://hdl.handle.net/1721.1/1043852016年09月26日T00:00:00ZFlowtune: Flowlet Control for Datacenter Networks
https://hdl.handle.net/1721.1/103920
Flowtune: Flowlet Control for Datacenter Networks
Perry, Jonathan; Balakrishnan, Hari; Shah, Devavrat
Rapid convergence to a desired allocation of network resources to endpoint traffic has been a long-standing challenge for packet-switched networks. The reason for this is that congestion control decisions are distributed across the endpoints, which vary their offered load in response to changes in application demand and network feedback on a packet-by-packet basis. We propose a different approach for datacenter networks, flowlet control, in which congestion control decisions are made at the granularity of a flowlet, not a packet. With flowlet control, allocations have to change only when flowlets arrive or leave. We have implemented this idea in a system called Flowtune using a centralized allocator that receives flowlet start and end notifications from endpoints. The allocator computes optimal rates using a new, fast method for network utility maximization, and updates endpoint congestion-control parameters. Experiments show that Flowtune outperforms DCTCP, pFabric, sfqCoDel, and XCP on tail packet delays in various settings, converging to optimal rates within a few packets rather than over several RTTs. Our implementation of Flowtune handles 10.4x more throughput per core and scales to 8x more cores than Fastpass, for an 83-fold throughput gain.
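As a toy of the centralized rate computation (topology, step size, and solver are invented; the paper's utility-maximization method is much faster), dual gradient ascent on link prices for logarithmic flow utilities:

    def allocate(flows, capacity, iters=2000, step=0.01):
        """flows: flow -> links it crosses; maximizes sum(log r_f) s.t. capacities."""
        price = {l: 1.0 for l in capacity}
        for _ in range(iters):
            # For log utility, each flow's best response is 1 / (path price).
            rate = {f: 1.0 / sum(price[l] for l in links)
                    for f, links in flows.items()}
            for l in capacity:       # raise prices on overloaded links
                load = sum(r for f, r in rate.items() if l in flows[f])
                price[l] = max(1e-6, price[l] + step * (load - capacity[l]))
        return rate

    flows = {"a": ["l1"], "b": ["l1", "l2"], "c": ["l2"]}
    caps = {"l1": 1.0, "l2": 1.0}
    print({f: round(r, 2) for f, r in allocate(flows, caps).items()})
    # Converges to roughly the proportionally fair rates a=0.67, b=0.33, c=0.67.

Because rates only need recomputation when a flowlet starts or ends, an allocator can afford to solve this optimization centrally rather than reacting packet by packet.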
2016年8月15日 00:00:00 GMThttps://hdl.handle.net/1721.1/1039202016年08月15日T00:00:00ZAutomatic Inference of Code Transforms and Search Spaces for Automatic Patch Generation Systems
https://hdl.handle.net/1721.1/103556
Automatic Inference of Code Transforms and Search Spaces for Automatic Patch Generation Systems
Long, Fan; Amidon, Peter; Rinard, Martin
We present a new system, Genesis, that processes sets of human patches to automatically infer code transforms and search spaces for automatic patch generation. We present results that characterize the effectiveness of the Genesis inference algorithms and the resulting complete Genesis patch generation system working with real-world patches and errors collected from the top 1,000 GitHub Java software development projects. To the best of our knowledge, Genesis is the first system to automatically infer patch generation transforms or candidate patch search spaces from successful patches.
2016年7月08日 00:00:00 GMThttps://hdl.handle.net/1721.1/1035562016年07月08日T00:00:00ZEvaluating Caching Mechanisms In Future Internet Architectures
https://hdl.handle.net/1721.1/103381
Evaluating Caching Mechanisms In Future Internet Architectures
Jing, Yuxin
This thesis tests and evaluates the performance effects of in-network storage in novel proposed Internet architectures. In a world where more and more people are mobile and connected to the Internet, we look at how the added variable of user mobility can affect how these architectures perform under different loads. Evaluating the effects of in-network storage and caching in these novel architectures provides another facet to understanding how viable an alternative they would be to the current TCP/IP paradigm of today's Internet. In Named Data Networking, where storage is used to directly cache content, we see its use of storage impact the locality of where things are, while in MobilityFirst, where storage is used to cache chunks to provide robust delivery, we look at how its different layers work together in a mobility event.
MEng thesis
2016年6月28日 00:00:00 GMThttps://hdl.handle.net/1721.1/1033812016年06月28日T00:00:00ZModeling Network User Behavior: Various Approaches
https://hdl.handle.net/1721.1/103379
Modeling Network User Behavior: Various Approaches
Xu, Shidan
This project involves learning to predict users' mobility within the network topology. Topological mobility, as opposed to physical mobility, can be substantial as a user switches from an LTE network to Wi-Fi while moving minimally physically. Our dataset consists of email IMAP logs, as they document associated client IP addresses as well as the clients' identifiers. Prediction of online mobility is of particular interest to the networks community. If we can predict online mobility with high probability, then new network architectures can be designed to optimize the caching system by minimizing packet retransmissions. We used various approaches and techniques to model the user's behavior, including probabilistic programming, regression, neural nets, and clustering algorithms. We compare and contrast how the models differ in their prediction accuracy, speed of convergence, and algorithmic complexity.
MEng thesis
2016年6月28日 00:00:00 GMThttps://hdl.handle.net/1721.1/1033792016年06月28日T00:00:00ZTowards Practical Theory: Bayesian Optimization and Optimal Exploration
https://hdl.handle.net/1721.1/102796
Towards Practical Theory: Bayesian Optimization and Optimal Exploration
Kawaguchi, Kenji
This thesis discusses novel principles to improve the theoretical analyses of a class of methods, aiming to provide theoretically driven yet practically useful methods. The thesis focuses on a class of methods, called bound-based search, which includes several planning algorithms (e.g., the A* algorithm and the UCT algorithm), several optimization methods (e.g., Bayesian optimization and Lipschitz optimization), and some learning algorithms (e.g., PAC-MDP algorithms). For Bayesian optimization, this work solves an open problem and achieves an exponential convergence rate. For learning algorithms, this thesis proposes a new analysis framework, called PAC-RMDP, and improves the previous theoretical bounds. The PAC-RMDP framework also provides a unifying view of some previous near-Bayes optimal and PAC-MDP algorithms. All proposed algorithms derived on the basis of the new principles produced competitive results in our numerical experiments with standard benchmark tests.
SM thesis
2016年5月26日 00:00:00 GMThttps://hdl.handle.net/1721.1/1027962016年05月26日T00:00:00ZDeep Learning without Poor Local Minima
https://hdl.handle.net/1721.1/102665
Deep Learning without Poor Local Minima
Kawaguchi, Kenji
In this paper, we prove a conjecture published in 1989 and also partially address an open problem announced at the Conference on Learning Theory (COLT) 2015. For an expected loss function of a deep nonlinear neural network, we prove the following statements under the independence assumption adopted from recent work: 1) the function is non-convex and non-concave, 2) every local minimum is a global minimum, 3) every critical point that is not a global minimum is a saddle point, and 4) the property of saddle points differs for shallow networks (with three layers) and deeper networks (with more than three layers). Moreover, we prove that the same four statements hold for deep linear neural networks with any depth, any widths and no unrealistic assumptions. As a result, we present an instance for which we can answer the following question: how difficult is it, in theory, to directly train a deep model? It is more difficult than for the classical machine learning models (because of the non-convexity), but not too difficult (because of the nonexistence of poor local minima and the property of the saddle points). We note that even though we have advanced the theoretical foundations of deep learning, there is still a gap between theory and practice.
2016年5月23日 00:00:00 GMThttps://hdl.handle.net/1721.1/1026652016年05月23日T00:00:00ZDelphi: A Software Controller for Mobile Network Selection
https://hdl.handle.net/1721.1/101636
Delphi: A Software Controller for Mobile Network Selection
Deng, Shuo; Sivaraman, Anirudh; Balakrishnan, Hari
This paper presents Delphi, a mobile software controller that helps applications select the best network among available choices for their data transfers. Delphi optimizes a specified objective such as transfer completion time, energy per byte transferred, or the monetary cost of a transfer. It has four components: a network monitor that gathers features, a performance predictor that uses those features, a traffic profiler that estimates transfer sizes near the start of a transfer, and a network selector that uses the prediction and transfer-size estimate to optimize an objective. For each transfer, Delphi either recommends the best single network to use, or recommends Multi-Path TCP (MPTCP), but crucially selects the network for MPTCP's primary subflow. The choice of primary subflow has a strong impact on the transfer completion time, especially for short transfers. We designed and implemented Delphi in Linux. It requires no application modifications. Our evaluation shows that Delphi reduces application network transfer time by 46% for Web browsing and by 49% for video streaming, compared with Android's default policy of always using Wi-Fi when it is available. Delphi can also be configured to achieve high throughput while being battery-efficient: in this configuration, it achieves 1.9x the throughput of Android's default policy while consuming only 6% more energy.
2016年2月25日 00:00:00 GMThttps://hdl.handle.net/1721.1/1016362016年02月25日T00:00:00ZAn Analysis of the Search Spaces for Generate and Validate Patch Generation Systems
https://hdl.handle.net/1721.1/101211
An Analysis of the Search Spaces for Generate and Validate Patch Generation Systems
Long, Fan; Rinard, Martin
We present the first systematic analysis of the characteristics of patch search spaces for automatic patch generation systems. We analyze the search spaces of two current state-of-the-art systems, SPR and Prophet, with 16 different search space configurations. Our results are derived from an analysis of 1104 different search spaces and 768 patch generation executions. Together these experiments consumed over 9000 hours of CPU time on Amazon EC2. The analysis shows that 1) correct patches are sparse in the search spaces (typically at most one correct patch per search space per defect), 2) incorrect patches that nevertheless pass all of the test cases in the validation test suite are typically orders of magnitude more abundant, and 3) leveraging information other than the test suite is therefore critical for enabling the system to successfully isolate correct patches. We also characterize a key tradeoff in the structure of the search spaces. Larger and richer search spaces that contain correct patches for more defects can actually cause systems to find fewer, not more, correct patches. We identify two reasons for this phenomenon: 1) increased validation times because of the presence of more candidate patches and 2) more incorrect patches that pass the test suite and block the discovery of correct patches. These fundamental properties, which are all characterized for the first time in this paper, help explain why past systems often fail to generate correct patches and help identify challenges, opportunities, and productive future directions for the field.
2016年2月18日 00:00:00 GMThttps://hdl.handle.net/1721.1/1012112016年02月18日T00:00:00ZOutlier Detection in Heterogeneous Datasets using Automatic Tuple Expansion
https://hdl.handle.net/1721.1/101150
Outlier Detection in Heterogeneous Datasets using Automatic Tuple Expansion
Pit-Claudel, Clément; Mariet, Zelda; Harding, Rachael; Madden, Sam
Rapidly developing areas of information technology are generating massive amounts of data. Human errors, sensor failures, and other unforeseen circumstances unfortunately tend to undermine the quality and consistency of these datasets by introducing outliers -- data points that exhibit surprising behavior when compared to the rest of the data. Characterizing, locating, and in some cases eliminating these outliers offers interesting insight about the data under scrutiny and reinforces the confidence that one may have in conclusions drawn from otherwise noisy datasets. In this paper, we describe a tuple expansion procedure which reconstructs rich information from semantically poor SQL data types such as strings, integers, and floating point numbers. We then use this procedure as the foundation of a new user-guided outlier detection framework, dBoost, which relies on inference and statistical modeling of heterogeneous data to flag suspicious fields in database tuples. We show that this novel approach achieves good classification performance, both in traditional numerical datasets and in highly non-numerical contexts such as mostly textual datasets. Our implementation is publicly available, under version 3 of the GNU General Public License.
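A self-contained example of tuple expansion (our own toy, not dBoost's code): a semantically poor value such as a Unix timestamp expands into correlated fields that per-field statistical models can then check individually:

    from datetime import datetime, timezone

    def expand_timestamp(ts):
        d = datetime.fromtimestamp(ts, tz=timezone.utc)
        return {"year": d.year, "month": d.month, "day": d.day,
                "weekday": d.weekday(), "hour": d.hour,
                "is_weekend": d.weekday() >= 5}

    print(expand_timestamp(0))
    # {'year': 1970, 'month': 1, 'day': 1, 'weekday': 3, 'hour': 0, 'is_weekend': False}

A tuple whose expanded "hour" field falls far outside the column's learned distribution can then be flagged even though the raw integer looked unremarkable.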
2016年2月08日 00:00:00 GMThttps://hdl.handle.net/1721.1/1011502016年02月08日T00:00:00ZInitial report on Object Spreadsheets
https://hdl.handle.net/1721.1/100803
Initial report on Object Spreadsheets
McCutchen, Richard Matthew; Itzhaky, Shachar; Jackson, Daniel
There is a growing demand for data-driven web applications that help automate organizational and business processes of low to medium complexity by letting users view and update structured data in controlled ways. We present Object Spreadsheets, an end-user development tool that combines a spreadsheet interface with a rich data model to help the process administrators build the logic for such applications themselves. Its all-in-one interface with immediate feedback has the potential to bring more complex tasks within reach of end-user developers, compared to existing approaches. Our data model is based on the structure of entity-relationship models and directly supports nested variable-size collections and object references, which are common in web applications but poorly accommodated by traditional spreadsheets. Object Spreadsheets has a formula language suited to the data model and supports stored procedures to specify the forms of updates that application users may make. Formulas can be used to assemble data in the exact structure in which it is to be shown in the application UI, simplifying the task of UI building; we intend for Object Spreadsheets to be integrated with a UI builder to provide a complete solution for application development. We describe our prototype implementation and several example applications we built to demonstrate the applicability of the tool.
2016年1月12日 00:00:00 GMThttps://hdl.handle.net/1721.1/1008032016年01月12日T00:00:00ZFiltered Iterators For Safe and Robust Programs in RIFL
https://hdl.handle.net/1721.1/100542
Filtered Iterators For Safe and Robust Programs in RIFL
Shen, Jiasi; Rinard, Martin
We present a new language construct, filtered iterators, for safe and robust input processing. Filtered iterators are designed to eliminate many common input-processing errors while enabling robust continued execution. The design is inspired by (a) observed common input-processing errors and (b) continued execution strategies that are implemented by developers fixing input validation errors. Filtered iterators decompose inputs into input units, atomically and automatically discarding units that trigger errors. Statistically significant results from a developer study highlight the difficulties that developers encounter when developing input-processing code using standard language constructs. These results also demonstrate the effectiveness of filtered iterators in eliminating many of these difficulties and enabling developers to produce safe and robust input-processing code.
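A hedged Python rendering of the construct's semantics (the paper introduces the construct in its own language, RIFL, so the real syntax differs): decompose the input into units and atomically discard any unit whose processing fails, continuing with the rest:

    def filtered_iterator(units, parse):
        for unit in units:
            try:
                yield parse(unit)        # keep units that process cleanly
            except Exception:
                continue                 # atomically discard the faulty unit

    lines = ["1,2", "oops", "3,4"]
    parse = lambda s: tuple(int(x) for x in s.split(","))
    print(list(filtered_iterator(lines, parse)))  # [(1, 2), (3, 4)]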
2015年12月27日 00:00:00 GMThttps://hdl.handle.net/1721.1/1005422015年12月27日T00:00:00ZJenga: Harnessing Heterogeneous Memories through Reconfigurable Cache Hierarchies
https://hdl.handle.net/1721.1/100466
Jenga: Harnessing Heterogeneous Memories through Reconfigurable Cache Hierarchies
Beckmann, Nathan; Tsai, Po-An; Sanchez, Daniel
Conventional memory systems are organized as a rigid hierarchy, with multiple levels of progressively larger and slower memories. Hierarchy allows a simple, fixed design to benefit a wide range of applications, because working sets settle at the smallest (and fastest) level they fit in. However, rigid hierarchies also cause significant overheads, because each level adds latency and energy even when it does not capture the working set. In emerging systems with heterogeneous memory technologies such as stacked DRAM, these overheads often limit performance and efficiency. We propose Jenga, a reconfigurable cache hierarchy that avoids these pathologies and approaches the performance of a hierarchy optimized for each application. Jenga monitors application behavior and dynamically builds virtual cache hierarchies out of heterogeneous, distributed cache banks. Jenga uses simple hardware support and a novel software runtime to configure virtual cache hierarchies. On a 36-core CMP with a 1 GB stacked-DRAM cache, Jenga outperforms a combination of state-of-the-art techniques by 10% on average and by up to 36%, and does so while saving energy, improving system-wide energy-delay product by 29% on average and by up to 96%.
2015年12月19日 00:00:00 GMThttps://hdl.handle.net/1721.1/1004662015年12月19日T00:00:00ZBridging Theory and Practice in Cache Replacement
https://hdl.handle.net/1721.1/100465
Bridging Theory and Practice in Cache Replacement
Beckmann, Nathan; Sanchez, Daniel
Much prior work has studied processor cache replacement policies, but a large gap remains between theory and practice. The optimal policy (MIN) requires unobtainable knowledge of the future, and prior theoretically-grounded policies use reference models that do not match real programs. Meanwhile, practical policies are designed empirically. Lacking a strong theoretical foundation, they do not make the best use of the information available to them. This paper bridges theory and practice. We propose that practical policies should replace lines based on their economic value added (EVA), the difference of their expected hits from the average. We use Markov decision processes to show that EVA is optimal under some reasonable simplifications. We present an inexpensive, practical implementation of EVA and evaluate it exhaustively over many cache sizes. EVA outperforms prior practical policies and saves area at iso-performance. These results show that formalizing cache replacement yields practical benefits.
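The ranking rule can be illustrated with a toy computation (a sketch of the idea only; the paper's implementation estimates these statistics in hardware, and all numbers below are assumed):

def eva(hit_prob, expected_lifetime, avg_hit_rate):
    # Economic value added of keeping a line: expected hits forgone minus
    # the opportunity cost of occupying cache space for the expected time.
    return hit_prob - avg_hit_rate * expected_lifetime

# Hypothetical per-age statistics gathered by monitoring (age -> estimate).
hit_prob = {1: 0.9, 8: 0.5, 32: 0.05}
lifetime = {1: 4.0, 8: 10.0, 32: 40.0}
avg_rate = 0.06   # assumed cache-average hits per unit of line lifetime

candidates = [1, 8, 32]   # ages of the lines in the eviction set
victim = min(candidates, key=lambda a: eva(hit_prob[a], lifetime[a], avg_rate))
print("evict the line of age", victim)   # 32: the lowest EVA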
2015年12月19日 00:00:00 GMThttps://hdl.handle.net/1721.1/1004652015年12月19日T00:00:00ZCache Calculus: Modeling Caches through Differential Equations
https://hdl.handle.net/1721.1/100464
Cache Calculus: Modeling Caches through Differential Equations
Beckmann, Nathan; Sanchez, Daniel
Caches are critical to performance, yet their behavior is hard to understand and model. In particular, prior work does not provide closed-form solutions of cache performance, i.e. simple expressions for the miss rate of a specific access pattern. Existing cache models instead use numerical methods that, unlike closed-form solutions, are computationally expensive and yield limited insight. We present cache calculus, a technique that models cache behavior as a system of ordinary differential equations, letting standard calculus techniques find simple and accurate solutions of cache performance for common access patterns.
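To illustrate the style of reasoning (with a toy model, not the paper's equations): suppose W distinct items are accessed uniformly at random and the cache can hold them all. If f(t) is the fraction of items cached after t accesses, each access misses with probability 1 - f and caches one new item, giving df/dt = (1 - f)/W and the closed-form warm-up curve f(t) = 1 - exp(-t/W).

import numpy as np
from scipy.integrate import solve_ivp

W = 100.0
sol = solve_ivp(lambda t, f: (1.0 - f) / W, (0.0, 500.0), [0.0],
                t_eval=np.linspace(0.0, 500.0, 6))
print(np.round(sol.y[0], 3))                  # numerical solution
print(np.round(1.0 - np.exp(-sol.t / W), 3))  # matches the closed form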
2015年12月19日 00:00:00 GMThttps://hdl.handle.net/1721.1/1004642015年12月19日T00:00:00ZSupplementary materials for "ProppLearner: Deeply Annotating a Corpus of Russian Folktales to Enable the Machine Learning of a Russian Formalist Theory"
https://hdl.handle.net/1721.1/100054
Supplementary materials for "ProppLearner: Deeply Annotating a Corpus of Russian Folktales to Enable the Machine Learning of a Russian Formalist Theory"
Finlayson, Mark Alan
This archive contains the supplementary material for the journal article "ProppLearner: Deeply Annotating a Corpus of Russian Folktales to Enable the Machine Learning of a Russian Formalist Theory", published in the journal Digital Scholarship in the Humanities (DSH), ca. 2016. The archive contains several different types of files. First, it contains the annotation guides that were used to train the annotators. The guides are numbered to match the team numbers in Table 6. Included here are not only detailed guides for some layers, as produced by the original developers of the specification, but also our synopsis guides for each layer, which were used as a reference and further training material for the annotators. Also of interest are the general annotator and adjudicator training guides, which outline the general procedures followed by the teams when conducting annotation. Those who are organizing their own annotation projects may find this material useful. Second, the archive contains a comprehensive manifest, in Excel spreadsheet format, listing the word counts, sources, types, and titles (in both Russian and English) of all the texts that are part of the corpus. Finally, the archive contains the actual corpus data files, in Story Workbench format, an XML-encoded stand-off annotation scheme. The scheme is described in the file format specification file, also included in the archive. These files can be parsed with the aid of any normal XML reading software, or can be loaded and edited easily with the Story Workbench annotation tool, also freely available.
2015年12月02日 00:00:00 GMThttps://hdl.handle.net/1721.1/1000542015年12月02日T00:00:00ZRepresentation Discovery for Kernel-Based Reinforcement Learning
https://hdl.handle.net/1721.1/100053
Representation Discovery for Kernel-Based Reinforcement Learning
Zewdie, Dawit H.; Konidaris, George
Recent years have seen increased interest in non-parametric reinforcement learning. There are now practical kernel-based algorithms for approximating value functions; however, kernel regression requires that the underlying function being approximated be smooth on its domain. Few problems of interest satisfy this requirement in their natural representation. In this paper we define Value-Consistent Pseudometric (VCPM), the distance function corresponding to a transformation of the domain into a space where the target function is maximally smooth and thus well-approximated by kernel regression. We then present DKBRL, an iterative batch RL algorithm interleaving steps of Kernel-Based Reinforcement Learning and distance metric adjustment. We evaluate its performance on Acrobot and PinBall, continuous-space reinforcement learning domains with discontinuous value functions.
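The smoothness requirement is easy to see with a toy Nadaraya-Watson regressor (illustrative only; the paper's setting is value-function approximation, not this one-dimensional example): it tracks a smooth target well but smears a discontinuity across the kernel bandwidth.

import numpy as np

def kernel_regress(xq, xs, ys, bw=0.1):
    # Gaussian-kernel weighted average of the training targets.
    w = np.exp(-0.5 * ((xq[:, None] - xs[None, :]) / bw) ** 2)
    return (w @ ys) / w.sum(axis=1)

xs = np.linspace(0, 1, 201)
step = (xs > 0.5).astype(float)          # discontinuous target function
xq = np.array([0.49, 0.51])
print(kernel_regress(xq, xs, step))      # ~[0.46, 0.54]: the jump is smeared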
2015年11月24日 00:00:00 GMThttps://hdl.handle.net/1721.1/1000532015年11月24日T00:00:00ZDynamic Prefetching of Data Tiles for Interactive Visualization
https://hdl.handle.net/1721.1/99361
Dynamic Prefetching of Data Tiles for Interactive Visualization
Battle, Leilani; Chang, Remco; Stonebraker, Michael
In this paper, we present ForeCache, a general-purpose tool for exploratory browsing of large datasets. ForeCache utilizes a client-server architecture, where the user interacts with a lightweight client-side interface to browse datasets, and the data to be browsed is retrieved from a DBMS running on a back-end server. We assume a detail-on-demand browsing paradigm, and optimize the back-end support for this paradigm by inserting a separate middleware layer in front of the DBMS. To improve response times, the middleware layer fetches data ahead of the user as she explores a dataset. We consider two different mechanisms for prefetching: (a) learning what to fetch from the user's recent movements, and (b) using data characteristics (e.g., histograms) to find data similar to what the user has viewed in the past. We incorporate these mechanisms into a single prediction engine that adjusts its prediction strategies over time, based on changes in the user's behavior. We evaluated our prediction engine with a user study, and found that our dynamic prefetching strategy provides: (1) significant improvements in overall latency when compared with non-prefetching systems (430% improvement); and (2) substantial improvements in both prediction accuracy (25% improvement) and latency (88% improvement) relative to existing prefetching techniques.
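The first of the two mechanisms can be sketched as a first-order Markov model over the user's recent movements (an assumed simplification; ForeCache's actual engine combines several predictors and adapts over time):

from collections import Counter, defaultdict

history = ["right", "right", "down", "right", "right", "down"]

trans = defaultdict(Counter)
for prev, nxt in zip(history, history[1:]):
    trans[prev][nxt] += 1                  # count direction transitions

current = history[-1]
predicted = trans[current].most_common(1)[0][0]
print("prefetch the tile in direction:", predicted)   # 'right'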
2015年10月19日 00:00:00 GMThttps://hdl.handle.net/1721.1/993612015年10月19日T00:00:00ZBig Data Privacy Scenarios
https://hdl.handle.net/1721.1/99127
Big Data Privacy Scenarios
Bruce, Elizabeth; Sollins, Karen; Vernon, Mona; Weitzner, Danny
This paper is the first in a series on privacy in Big Data. As an outgrowth of a series of workshops on the topic, the Big Data Privacy Working Group undertook a study of a series of use scenarios to highlight the challenges to privacy that arise in the Big Data arena. This is a report on those scenarios. The deeper question explored by this exercise is what is distinctive about privacy in the context of Big Data. In addition, we discuss an initial list of issues for privacy that derive specifically from the nature of Big Data. These derive from observations across the real-world scenarios and use cases explored in this project as well as wider reading and discussions:
* Scale: The sheer size of the datasets leads to challenges in creating, managing, and applying privacy policies.
* Diversity: The increased likelihood of more and more diverse participants in Big Data collection, management, and use leads to differing, and by nature often contradictory, agendas and objectives.
* Integration: With increased data management technologies (e.g., cloud services, data lakes, and so forth) and integration across datasets, with new and often surprising opportunities for cross-product inferences, will come new information about individuals and their behaviors.
* Impact on secondary participants: Because many pieces of information reflect not only the targeted subject but also secondary, often unintended, participants, the resulting inferences and information will increasingly describe people who were not originally considered subjects of privacy concerns and approaches.
* Need for emergent policies for emergent information: As inferences over merged data sets occur, emergent information or understanding will occur. Although each individual data set may have existing privacy policies and enforcement mechanisms, it is not clear that the requisite emergent privacy policies, and appropriate enforcement of them, can be developed automatically.
2015年10月01日 00:00:00 GMThttps://hdl.handle.net/1721.1/991272015年10月01日T00:00:00ZDesigning a Context-Sensitive Context Detection Service for Mobile Devices
https://hdl.handle.net/1721.1/98905
Designing a Context-Sensitive Context Detection Service for Mobile Devices
Chen, Tiffany Yu-Han; Sivaraman, Anirudh; Das, Somak; Ravindranath, Lenin; Balakrishnan, Hari
This paper describes the design, implementation, and evaluation of Amoeba, a context-sensitive context detection service for mobile devices. Amoeba exports an API that allows a client to express interest in one or more context types (activity, indoor/outdoor, and entry/exit to/from named regions), subscribe to specific modes within each context (e.g., "walking" or "running", but no other activity), and specify a response latency (i.e., how often the client is notified). Each context has a detector that returns its estimate of the mode. The detectors take both the desired subscriptions and the current context detection into account, adjusting both the types of sensors and the sampling rates to achieve high accuracy and low energy consumption. We have implemented Amoeba on Android. Experiments with Amoeba on 45+ hours of data show that our activity detector achieves an accuracy between 92% and 99%, outperforming previous proposals like UCLA* (59%), EEMSS (82%) and SociableSense (72%), while consuming at least 4× less energy.
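A sketch of what such a client-facing API might look like (hypothetical names; the real Android interface may differ): clients subscribe to specific modes of a context type along with a response latency, and the service notifies them only about subscribed modes.

class ContextService:
    def __init__(self):
        self.subs = []

    def subscribe(self, context_type, modes, latency_s, callback):
        # e.g. context_type="activity", modes={"walking", "running"}
        self.subs.append((context_type, set(modes), latency_s, callback))

    def report(self, context_type, mode):
        # Called by the detector for context_type with its current estimate.
        for ctype, modes, _, cb in self.subs:
            if ctype == context_type and mode in modes:
                cb(context_type, mode)

svc = ContextService()
svc.subscribe("activity", {"walking", "running"}, latency_s=30,
              callback=lambda c, m: print("notified:", c, m))
svc.report("activity", "walking")   # notified: activity walking
svc.report("activity", "driving")   # filtered out: mode not subscribed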
2015年9月24日 00:00:00 GMThttps://hdl.handle.net/1721.1/989052015年09月24日T00:00:00ZNetwork Maximal Correlation
https://hdl.handle.net/1721.1/98878
Network Maximal Correlation
Feizi, Soheil; Makhdoumi, Ali; Duffy, Ken; Kellis, Manolis; Medard, Muriel
Identifying nonlinear relationships in large datasets is a daunting task, particularly when the form of the nonlinearity is unknown. Here, we introduce Network Maximal Correlation (NMC) as a fundamental measure to capture nonlinear associations in networks without knowledge of the underlying nonlinearity shapes. NMC infers, possibly nonlinear, transformations of variables with zero means and unit variances by maximizing total nonlinear correlation over the underlying network. For the case of two variables, NMC is equivalent to the standard Maximal Correlation. We characterize a solution of the NMC optimization using geometric properties of Hilbert spaces for both discrete and jointly Gaussian variables. For discrete random variables, we show that the NMC optimization is an instance of the Maximum Correlation Problem and provide necessary conditions for its global optimal solution. Moreover, we propose an efficient algorithm based on Alternating Conditional Expectation (ACE) which converges to a local NMC optimum. For this algorithm, we provide guidelines for choosing appropriate starting points to jump out of local maximizers. We also propose a distributed algorithm to compute a (1−ε) approximation of the NMC value for large and dense graphs using graph partitioning. For jointly Gaussian variables, under some conditions, we show that the NMC optimization can be simplified to a Max-Cut problem, and we provide conditions under which an NMC solution can be computed exactly. Under some general conditions, we show that NMC can infer the underlying graphical model for functions of latent jointly Gaussian variables. These functions are unknown, bijective, and can be nonlinear. This result broadens the family of continuous distributions whose graphical models can be characterized efficiently. We illustrate the robustness of NMC in real-world applications by showing its continuity with respect to small perturbations of joint distributions. We also show that sample NMC (NMC computed using empirical distributions) converges exponentially fast to the true NMC value. Finally, we apply NMC to different cancer datasets including breast, kidney and liver cancers, and show that NMC infers gene modules that are significantly associated with survival times of individuals while they are not detected using linear association measures.
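For two discrete variables, the ACE building block is short enough to sketch directly (a minimal version for a known joint distribution; NMC extends this alternation to a whole network):

import numpy as np

def ace(P, iters=200, seed=0):
    # P[i, j] = Pr(X = i, Y = j). Alternate f <- E[g(Y)|X] and
    # g <- E[f(X)|Y], re-standardizing to zero mean and unit variance.
    px, py = P.sum(axis=1), P.sum(axis=0)
    g = np.random.default_rng(seed).standard_normal(P.shape[1])
    for _ in range(iters):
        f = (P @ g) / px
        f -= px @ f
        f /= np.sqrt(px @ f**2)
        g = (P.T @ f) / py
        g -= py @ g
        g /= np.sqrt(py @ g**2)
    return f, g, f @ P @ g          # E[f(X) g(Y)]: the maximal correlation

P = np.array([[0.4, 0.1],
              [0.1, 0.4]])
print(round(float(ace(P)[2]), 3))   # 0.6 for this joint distribution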
2015年9月21日 00:00:00 GMThttps://hdl.handle.net/1721.1/988782015年09月21日T00:00:00ZProphet: Automatic Patch Generation via Learning from Successful Patches
https://hdl.handle.net/1721.1/97735
Prophet: Automatic Patch Generation via Learning from Successful Patches
Long, Fan; Rinard, Martin
We present Prophet, a novel patch generation system that learns a probabilistic model over candidate patches from a database of past successful patches. Prophet defines the probabilistic model as the combination of a distribution over program points based on defect localization algorithms and a parametrized log-linear distribution over modification operations. It then learns the model parameters via maximum log-likelihood, which identifies important characteristics of the previous successful patches in the database. For a new defect, Prophet generates a search space that contains many candidate patches, applies the learned model to prioritize those potentially correct patches that are consistent with the identified successful patch characteristics, and then validates the candidate patches with a user-supplied test suite. The experimental results indicate that these techniques enable Prophet to generate correct patches for 15 out of 69 real-world defects in eight open source projects. The previous state-of-the-art generate-and-validate system, which uses a set of hand-coded heuristics to prioritize the search, generates correct patches for 11 of these same 69 defects.
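The log-linear ranking step can be illustrated with a toy (weights and features below are assumed for illustration; Prophet learns its weights by maximum likelihood over features derived from defect localization and patch structure):

import math

weights = {"near_fault_loc": 1.2, "adds_null_check": 0.8,
           "deletes_stmt": -0.5}                   # assumed, not learned

def score(features):
    return sum(weights[f] for f in features)       # log-linear score

candidates = {
    "patch_A": ["near_fault_loc", "adds_null_check"],
    "patch_B": ["deletes_stmt"],
}
z = sum(math.exp(score(f)) for f in candidates.values())
probs = {p: math.exp(score(f)) / z for p, f in candidates.items()}
for p in sorted(probs, key=probs.get, reverse=True):
    print(p, round(probs[p], 3))    # validate patches in this order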
2015年7月13日 00:00:00 GMThttps://hdl.handle.net/1721.1/977352015年07月13日T00:00:00ZKeys Under Doormats: Mandating insecurity by requiring government access to all data and communications
https://hdl.handle.net/1721.1/97690
Keys Under Doormats: Mandating insecurity by requiring government access to all data and communications
Abelson, Harold; Anderson, Ross; Bellovin, Steven M.; Benaloh, Josh; Blaze, Matt; Diffie, Whitfield; Gilmore, John; Green, Matthew; Landau, Susan; Neumann, Peter G.; Rivest, Ronald L.; Schiller, Jeffrey I.; Schneier, Bruce; Specter, Michael; Weitzner, Daniel J.
Twenty years ago, law enforcement organizations lobbied to require data and communication services to engineer their products to guarantee law enforcement access to all data. After lengthy debate and vigorous predictions of enforcement channels going dark, these attempts to regulate the emerging Internet were abandoned. In the intervening years, innovation on the Internet flourished, and law enforcement agencies found new and more effective means of accessing vastly larger quantities of data. Today we are again hearing calls for regulation to mandate the provision of exceptional access mechanisms. In this report, a group of computer scientists and security experts, many of whom participated in a 1997 study of these same topics, has convened to explore the likely effects of imposing extraordinary access mandates. We have found that the damage that could be caused by law enforcement exceptional access requirements would be even greater today than it would have been 20 years ago. In the wake of the growing economic and social cost of the fundamental insecurity of today's Internet environment, any proposals that alter the security dynamics online should be approached with caution. Exceptional access would force Internet system developers to reverse forward secrecy design practices that seek to minimize the impact on user privacy when systems are breached. The complexity of today's Internet environment, with millions of apps and globally connected services, means that new law enforcement requirements are likely to introduce unanticipated, hard to detect security flaws. Beyond these and other technical vulnerabilities, the prospect of globally deployed exceptional access systems raises difficult problems about how such an environment would be governed and how to ensure that such systems would respect human rights and the rule of law.
2015年7月06日 00:00:00 GMThttps://hdl.handle.net/1721.1/976902015年07月06日T00:00:00ZPhD Thesis Proposal: Human-Machine Collaborative Optimization via Apprenticeship Scheduling
https://hdl.handle.net/1721.1/97689
PhD Thesis Proposal: Human-Machine Collaborative Optimization via Apprenticeship Scheduling
Gombolay, Matthew C.
Resource optimization in health care, manufacturing, and military operations requires the careful choreography of people and equipment to effectively fulfill the responsibilities of the profession. However, resource optimization is a computationally challenging problem, and poorly utilizing resources can have drastic consequences. Within these professions, there are human domain experts who are able to learn from experience to develop strategies, heuristics, and rules-of-thumb to effectively utilize the resources at their disposal. Manually codifying these heuristics within a computational tool is a laborious process and leaves much to be desired. Even with a codified set of heuristics, it is not clear how to best insert an autonomous decision-support system into the human decision-making process. The aim of this thesis is to develop an autonomous computational method for learning domain-expert heuristics from demonstration that can support the human decision-making process. We propose a new framework, called apprenticeship scheduling, which learns and embeds these heuristics within a scalable resource optimization algorithm for real-time decision support. Our initial investigation, comprising the development of scalable scheduling methods and a study of shared control in human-machine collaborative resource optimization, inspires the development of our apprenticeship scheduling approach. We present a promising initial prototype for learning heuristics from demonstration and outline a plan for our continuing work.
2015年7月02日 00:00:00 GMThttps://hdl.handle.net/1721.1/976892015年07月02日T00:00:00ZGuaranteeing Spoof-Resilient Multi-Robot Networks
https://hdl.handle.net/1721.1/97442
Guaranteeing Spoof-Resilient Multi-Robot Networks
Gil, Stephanie; Kumar, Swarun; Mazumder, Mark; Katabi, Dina; Rus, Daniela
Multi-robot networks use wireless communication to provide wide-ranging services such as aerial surveillance and unmanned delivery. However, effective coordination between multiple robots requires trust, making them particularly vulnerable to cyber-attacks. Specifically, such networks can be gravely disrupted by the Sybil attack, where even a single malicious robot can spoof a large number of fake clients. This paper proposes a new solution to defend against the Sybil attack, without requiring expensive cryptographic key-distribution. Our core contribution is a novel algorithm implemented on commercial Wi-Fi radios that can "sense" spoofers using the physics of wireless signals. We derive theoretical guarantees on how this algorithm bounds the impact of the Sybil Attack on a broad class of robotic coverage problems. We experimentally validate our claims using a team of AscTec quadrotor servers and iRobot Create ground clients, and demonstrate spoofer detection rates over 96%.
https://hdl.handle.net/1721.1/97442Value-Deviation-Bounded Serial Data Encoding for Energy-Efficient Approximate Communication
https://hdl.handle.net/1721.1/97180
Value-Deviation-Bounded Serial Data Encoding for Energy-Efficient Approximate Communication
Stanley-Marbell, Phillip; Rinard, Martin
Transferring data between ICs accounts for a growing proportion of system power in wearable and mobile systems. Reducing signal transitions reduces the dynamic power dissipated in this data transfer, but traditional approaches cannot be applied when the transfer interfaces are serial buses. To address this challenge, we present a family of optimal value-deviation-bounded approximate serial encoders (VDBS encoders) that significantly reduce signal transitions (and hence, dynamic power) for bit-serial communication interfaces. When the data being transferred come from sensors, VDBS encoding enables a tradeoff between power efficiency and application fidelity by exploiting the tolerance of many of the typical algorithms consuming sensor data to deviations in values. We derive analytic formulations for the family of VDBS encoders and introduce an efficient algorithm that performs close to the Pareto-optimal encoders. We evaluate the algorithm in two applications: encoding data between a camera and processor in a text-recognition system, and between an accelerometer and processor in a pedometer system. For the text recognizer, the algorithm reduces signal transitions by 55% on average, while maintaining OCR accuracy at over 90% for previously-correctly-recognized text. For the pedometer, the algorithm reduces signal transitions by an average of 54% in exchange for step count errors of under 5%.
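A toy encoder conveys the tradeoff (a sketch of the idea, not the paper's optimal construction): among all values within the deviation bound, transmit the one whose serialization causes the fewest line transitions given the last bit already on the wire.

def transitions(word, prev_bit, width=8):
    # Count bit flips when `word` is shifted out MSB-first after prev_bit.
    bits = [(word >> i) & 1 for i in range(width - 1, -1, -1)]
    return sum(b != p for b, p in zip(bits, [prev_bit] + bits[:-1]))

def vdbs_encode(value, d, prev_bit, width=8):
    lo, hi = max(0, value - d), min((1 << width) - 1, value + d)
    return min(range(lo, hi + 1), key=lambda w: transitions(w, prev_bit))

# 127 = 01111111; within +/-2 the encoder prefers 128 = 10000000,
# which costs a single transition after a preceding 1 bit.
print(vdbs_encode(127, d=2, prev_bit=1))   # 128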
2015年6月04日 00:00:00 GMThttps://hdl.handle.net/1721.1/971802015年06月04日T00:00:00ZAn Analysis of Patch Plausibility and Correctness for Generate-And-Validate Patch Generation Systems
https://hdl.handle.net/1721.1/97130
An Analysis of Patch Plausibility and Correctness for Generate-And-Validate Patch Generation Systems
Qi, Zichao; Long, Fan; Achour, Sara; Rinard, Martin
We analyze reported patches for three existing generate-and-validate patch generation systems (GenProg, RSRepair, and AE). The basic principle behind generate-and-validate systems is to accept only plausible patches that produce correct outputs for all inputs in the test suite used to validate the patches. Because of errors in the patch evaluation infrastructure, the majority of the reported patches are not plausible -- they do not produce correct outputs even for the inputs in the validation test suite. The overwhelming majority of the reported patches are not correct and are equivalent to a single modification that simply deletes functionality. Observed negative effects include the introduction of security vulnerabilities and the elimination of desirable standard functionality. We also present Kali, a generate-and-validate patch generation system that only deletes functionality. Working with a simpler and more effectively focused search space, Kali generates at least as many correct patches as prior GenProg, RSRepair, and AE systems. Kali also generates at least as many patches that produce correct outputs for the inputs in the validation test suite as the three prior systems. We also discuss the patches produced by ClearView, a generate-and-validate binary hot patching system that leverages learned invariants to produce patches that enable systems to survive otherwise fatal defects and security attacks. Our analysis indicates that ClearView successfully patches 9 of the 10 security vulnerabilities used to evaluate the system. At least 4 of these patches are correct.
2015年5月29日 00:00:00 GMThttps://hdl.handle.net/1721.1/971302015年05月29日T00:00:00ZAn Analysis of Patch Plausibility and Correctness for Generate-And-Validate Patch Generation Systems
https://hdl.handle.net/1721.1/97089
An Analysis of Patch Plausibility and Correctness for Generate-And-Validate Patch Generation Systems
Qi, Zichao; Long, Fan; Achour, Sara; Rinard, Martin
We analyze reported patches for three existing generate-and-validate patch generation systems (GenProg, RSRepair, and AE). The basic principle behind generate-and-validate systems is to accept only plausible patches that produce correct outputs for all inputs in the test suite used to validate the patches. Because of errors in the patch evaluation infrastructure, the majority of the reported patches are not plausible --- they do not produce correct outputs even for the inputs in the validation test suite. The overwhelming majority of the reported patches are not correct and are equivalent to a single modification that simply deletes functionality. Observed negative effects include the introduction of security vulnerabilities and the elimination of desirable standard functionality. We also present Kali, a generate-and-validate patch generation system that only deletes functionality. Working with a simpler and more effectively focused search space, Kali generates at least as many correct patches as prior GenProg, RSRepair, and AE systems. Kali also generates at least as many patches that produce correct outputs for the inputs in the validation test suite as the three prior systems. We also discuss patches produced by ClearView, a generate-and-validate binary hot patching system that leverages learned invariants to produce patches that enable systems to survive otherwise fatal defects and security attacks. Our analysis indicates that ClearView successfully patches 9 of the 10 security vulnerabilities used to evaluate the system. At least 4 of these patches are correct.
2015年5月26日 00:00:00 GMThttps://hdl.handle.net/1721.1/970892015年05月26日T00:00:00ZProphet: Automatic Patch Generation via Learning from Successful Human Patches
https://hdl.handle.net/1721.1/97088
Prophet: Automatic Patch Generation via Learning from Successful Human Patches
Long, Fan; Rinard, Martin
We present Prophet, a novel patch generation system that learns a probabilistic model over candidate patches from a large code database that contains many past successful human patches. It defines the probabilistic model as the combination of a distribution over program points based on error localization algorithms and a parameterized log-linear distribution over modification operations. It then learns the model parameters via maximum log-likelihood, which identifies important characteristics of the successful human patches. For a new defect, Prophet generates a search space that contains many candidate patches, applies the learned model to prioritize those potentially correct patches that are consistent with the identified successful patch characteristics, and then validates the candidate patches with a user supplied test suite.
2015年5月26日 00:00:00 GMThttps://hdl.handle.net/1721.1/970882015年05月26日T00:00:00ZAutomatic Discovery and Patching of Buffer and Integer Overflow Errors
https://hdl.handle.net/1721.1/97087
Automatic Discovery and Patching of Buffer and Integer Overflow Errors
Sidiroglou-Douskos, Stelios; Lahtinen, Eric; Rinard, Martin
We present Targeted Automatic Patching (TAP), an automatic buffer and integer overflow discovery and patching system. Starting with an application and a seed input that the application processes correctly, TAP dynamically analyzes the execution of the application to locate target memory allocation sites and statements that access dynamically or statically allocated blocks of memory. It then uses targeted error-discovery techniques to automatically generate inputs that trigger integer and/or buffer overflows at the target sites. When it discovers a buffer or integer overflow error, TAP automatically matches and applies patch templates to generate patches that eliminate the error. Our experimental results show that TAP successfully discovers and patches two buffer and six integer overflow errors in six real-world applications.
2015年5月26日 00:00:00 GMThttps://hdl.handle.net/1721.1/970872015年05月26日T00:00:00ZSimit: A Language for Physical Simulation
https://hdl.handle.net/1721.1/97075
Simit: A Language for Physical Simulation
Kjolstad, Fredrik; Kamil, Shoaib; Ragan-Kelley, Jonathan; Levin, David I.W.; Sueda, Shinjiro; Chen, Desai; Vouga, Etienne; Kaufman, Danny M.; Kanwar, Gurtej; Matusik, Wojciech; Amarasinghe, Saman
Using existing programming tools, writing high-performance simulation code is labor intensive and requires sacrificing readability and portability. The alternative is to prototype simulations in a high-level language like Matlab, thereby sacrificing performance. The Matlab programming model naturally describes the behavior of an entire physical system using the language of linear algebra. However, simulations also manipulate individual geometric elements, which are best represented using linked data structures like meshes. Translating between the linked data structures and linear algebra comes at significant cost, both to the programmer and the machine. High-performance implementations avoid the cost by rephrasing the computation in terms of linked or index data structures, leaving the code complicated and monolithic, often increasing its size by an order of magnitude. In this paper, we present Simit, a new language for physical simulations that lets the programmer view the system both as a linked data structure in the form of a hypergraph, and as a set of global vectors, matrices and tensors depending on what is convenient at any given time. Simit provides a novel assembly construct that makes it conceptually easy and computationally efficient to move between the two abstractions. Using the information provided by the assembly construct, the compiler generates efficient in-place computation on the graph. We demonstrate that Simit is easy to use: a Simit program is typically shorter than a Matlab program; that it is high-performance: a Simit program running sequentially on a CPU performs comparably to hand-optimized simulations; and that it is portable: Simit programs can be compiled for GPUs with no change to the program, delivering 5-25x speedups over our optimized CPU code.
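The assembly idea can be gestured at in plain Python (a sketch, not Simit): the same system is viewed both as a graph of elements and as global matrices, with an assembly step scattering per-element contributions into global storage, here for a chain of springs.

import numpy as np

nodes = 4
springs = [(0, 1, 10.0), (1, 2, 20.0), (2, 3, 10.0)]   # (i, j, stiffness)

K = np.zeros((nodes, nodes))
for i, j, k in springs:
    # Per-element 2x2 block scattered into the global stiffness matrix.
    K[i, i] += k; K[j, j] += k
    K[i, j] -= k; K[j, i] -= k

print(K)   # the global (linear-algebra) view, assembled from the graph view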
2015年5月26日 00:00:00 GMThttps://hdl.handle.net/1721.1/970752015年05月26日T00:00:00ZAn Analysis of Patch Plausibility and Correctness for Generate-And-Validate Patch Generation Systems (Supplementary Material)
https://hdl.handle.net/1721.1/97051
An Analysis of Patch Plausibility and Correctness for Generate-And-Validate Patch Generation Systems (Supplementary Material)
Qi, Zichao; Long, Fan; Achour, Sara; Rinard, Martin
We analyze reported patches for three prior generate-and-validate patch generation systems (GenProg, RSRepair, and AE). Because of errors in the patch evaluation infrastructure, the majority of the reported patches violate the basic principle behind the design of these systems -- they do not produce correct outputs even for the inputs in the test suite used to validate the patches. We also show that the overwhelming majority of the accepted patches are not correct and are equivalent to a single modification that simply deletes functionality. We also present Kali, a generate-and-validate patch generation system that only deletes functionality. Working with a simpler and more effectively focused search space, Kali generates at least as many correct patches as prior GenProg, RSRepair, and AE systems. Kali also generates at least as many patches that produce correct outputs for the inputs in the validation test suite as the three prior systems. We also discuss the patches produced by ClearView, a generate-and-validate binary hot patching system that leverages learned invariants to produce patches that enable systems to survive otherwise fatal defects and security attacks. Our analysis indicates that ClearView successfully patches 9 of the 10 security vulnerabilities used to evaluate the system. At least 4 of these patches are correct.
2015年5月21日 00:00:00 GMThttps://hdl.handle.net/1721.1/970512015年05月21日T00:00:00ZA (Truly) Local Broadcast Layer for Unreliable Radio Networks
https://hdl.handle.net/1721.1/97014
A (Truly) Local Broadcast Layer for Unreliable Radio Networks
Lynch, Nancy; Newport, Calvin
In this paper, we implement an efficient local broadcast service for the dual graph model, which describes communication in a radio network with both reliable and unreliable links. Our local broadcast service offers probabilistic latency guarantees for: (1) message delivery to all reliable neighbors (i.e., neighbors connected by reliable links), and (2) receiving some message when one or more reliable neighbors are broadcasting. This service significantly simplifies the design and analysis of algorithms for the otherwise challenging dual graph model. To this end, we also note that our solution can be interpreted as an implementation of the abstract MAC layer specification---therefore translating the growing corpus of algorithmic results studied on top of this layer to the dual graph model. At the core of our service is a seed agreement routine which enables nodes in the network to achieve "good enough" coordination to overcome the difficulties of unpredictable link behavior. Because this routine has potential application to other problems in this setting, we capture it with a formal specification---simplifying its reuse in other algorithms. Finally, we note that in a break from much work on distributed radio network algorithms, our problem definitions (including error bounds), implementation, and analysis do not depend on global network parameters such as the network size, a goal which required new analysis techniques. We argue that breaking the dependence of these algorithms on global parameters makes more sense and aligns better with the rise of ubiquitous computing, where devices will be increasingly working locally in an otherwise massive network. Our push for locality, in other words, is a contribution independent of the specific radio network model and problem studied here.
2015年5月18日 00:00:00 GMThttps://hdl.handle.net/1721.1/970142015年05月18日T00:00:00ZNon-Essential Communication in Mobile Applications
https://hdl.handle.net/1721.1/96909
Non-Essential Communication in Mobile Applications
Rubin, Julia; Gordon, Michael I.; Nguyen, Nguyen; Rinard, Martin
This paper studies communication patterns in mobile applications. Our analysis shows that 65% of the HTTP, socket, and RPC communication in top-popular Android applications from Google Play have no effect on the user-observable application functionality. We present a static analysis that is able to detect non-essential communication with 84%-90% precision and 63%-64% recall, depending on whether advertisement content is interpreted as essential or not. We use our technique to analyze the 500 top-popular Android applications from Google Play and determine that more than 80% of the connection statements in these applications are non-essential.
2015年5月04日 00:00:00 GMThttps://hdl.handle.net/1721.1/969092015年05月04日T00:00:00ZMarkov Chain Hallway and Poisson Forest Environment Generating Distributions
https://hdl.handle.net/1721.1/96879
Markov Chain Hallway and Poisson Forest Environment Generating Distributions
Richter, Charles; Vega-Brown, William; Roy, Nicholas
We document two environment-generating distributions used for sampling random 2D maps. The first generates random hallway environments based on a Markov chain and the second generates random forest environments based on the Poisson distribution.
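The Poisson forest generator in particular is compact enough to sketch (parameter names assumed): the number of trees in a region of area A is drawn from Poisson(density * A), and tree centers are placed uniformly at random.

import numpy as np

def poisson_forest(width, height, density, radius, rng):
    n = rng.poisson(density * width * height)          # tree count
    xy = rng.uniform([0.0, 0.0], [width, height], size=(n, 2))
    return [(x, y, radius) for x, y in xy]             # circular obstacles

rng = np.random.default_rng(1)
forest = poisson_forest(width=50.0, height=50.0, density=0.01,
                        radius=0.5, rng=rng)
print(len(forest), "trees, e.g.", forest[0])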
2015年4月27日 00:00:00 GMThttps://hdl.handle.net/1721.1/968792015年04月27日T00:00:00ZAutomatic Error Elimination by Horizontal Code Transfer Across Multiple Applications
https://hdl.handle.net/1721.1/96625
Automatic Error Elimination by Horizontal Code Transfer Across Multiple Applications
Sidiroglou-Douskos, Stelios; Lahtinen, Eric; Long, Fan; Rinard, Martin
We present Code Phage (CP), a system for automatically transferring correct code from donor applications into recipient applications that process the same inputs to successfully eliminate errors in the recipient. Experimental results using seven donor applications to eliminate ten errors in seven recipient applications highlight the ability of CP to transfer code across applications to eliminate out of bounds access, integer overflow, and divide by zero errors. Because CP works with binary donors with no need for source code or symbolic information, it supports a wide range of use cases. To the best of our knowledge, CP is the first system to automatically transfer code across multiple applications.
2015年4月15日 00:00:00 GMThttps://hdl.handle.net/1721.1/966252015年04月15日T00:00:00ZHorizontal Code Transfer via Program Fracture and Recombination
https://hdl.handle.net/1721.1/96585
Horizontal Code Transfer via Program Fracture and Recombination
Sidiroglou-Douskos, Stelios; Davis, Eli; Rinard, Martin
We present a new horizontal code transfer technique, program fracture and recombination, for automatically replacing, deleting, and/or combining code from multiple applications. Benefits include automatic generation of new applications incorporating the best or most desirable functionality developed anywhere, the automatic elimination of security vulnerabilities, effective software rejuvenation, the automatic elimination of obsolete or undesirable functionality, and improved performance, simplicity, analyzability, and clarity.
2015年4月14日 00:00:00 GMThttps://hdl.handle.net/1721.1/965852015年04月14日T00:00:00ZA Cache Model for Modern Processors
https://hdl.handle.net/1721.1/96525
A Cache Model for Modern Processors
Beckmann, Nathan; Sanchez, Daniel
Modern processors use high-performance cache replacement policies that outperform traditional alternatives like least-recently used (LRU). Unfortunately, current cache models use stack distances to predict LRU or its variants, and cannot capture these high-performance policies. Accurate predictions of cache performance enable many optimizations in multicore systems. For example, cache partitioning uses these predictions to divide capacity among applications in order to maximize performance, guarantee quality of service, or achieve other system objectives. Without an accurate model for high-performance replacement policies, these optimizations are unavailable to modern processors. We present a new probabilistic cache model designed for high-performance replacement policies. This model uses absolute reuse distances instead of stack distances, which makes it applicable to arbitrary age-based replacement policies. We thoroughly validate our model on several high-performance policies on synthetic and real benchmarks, where its median error is less than 1%. Finally, we present two case studies showing how to use the model to improve shared and single-stream cache performance.
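A textbook-style toy shows why absolute reuse distances are a convenient basis (this is an illustration, not the paper's model): under random replacement in a cache of C lines, a reference with reuse distance d survives each of the d intervening accesses with probability roughly 1 - 1/C, so P(hit | d) = (1 - 1/C)**d, and the hit rate is that quantity averaged over the reuse-distance distribution.

import numpy as np

C = 64
dists = np.array([8, 32, 128, 512])       # reuse-distance histogram bins
frac = np.array([0.4, 0.3, 0.2, 0.1])     # fraction of accesses in each bin

hit_rate = frac @ (1 - 1 / C) ** dists    # expected hit rate under the toy model
print(round(float(hit_rate), 3))          # ~0.56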
2015年4月09日 00:00:00 GMThttps://hdl.handle.net/1721.1/965252015年04月09日T00:00:00ZiBCM: Interactive Bayesian Case Model Empowering Humans via Intuitive Interaction
https://hdl.handle.net/1721.1/96315
iBCM: Interactive Bayesian Case Model Empowering Humans via Intuitive Interaction
Kim, Been; Glassman, Elena; Johnson, Brittney; Shah, Julie
Clustering methods optimize the partitioning of data points with respect to an internal metric, such as likelihood, in order to approximate the goodness of clustering. However, this internal metric does not necessarily translate into effective clustering from the user's perspective. This work presents the interactive Bayesian Case Model (iBCM), a model that opens a communication channel between the clustering model and the user. Users can provide direct input to iBCM in order to achieve effective clustering results, and iBCM optimizes the clustering by creating a balance between what the data indicate and what makes the most sense to the user. This model provides feedback for users and does not assume any prior knowledge of machine learning on their part. We provide quantitative evidence that users are able to obtain more satisfactory clustering results through iBCM than without an interactive model. We also demonstrate the use of this method in a real-world setting where computer language class teachers utilize iBCM to cluster students' coding assignments for grading.
2015年4月01日 00:00:00 GMThttps://hdl.handle.net/1721.1/963152015年04月01日T00:00:00ZA Suite of Techniques for Describing Activity in Terms of Events
https://hdl.handle.net/1721.1/96300
A Suite of Techniques for Describing Activity in Terms of Events
Borchardt, Gary C.
This report presents a set of software techniques that support the tasks of event recognition, summarization of event sequences, explanation of recognized events, explanation of non-recognized events, prediction of event completions, and question answering by leveraging language-encoded human knowledge of what typically happens during various types of events. The techniques operate on sequences of timestamped, three-dimensional positions and contacts for humans, body parts, and objects, provided by a Microsoft Kinect sensor plus associated software. Appendices describe 64 activity sequences used for development and testing of the techniques and 102 event models created as part of the effort.
2015年3月30日 00:00:00 GMThttps://hdl.handle.net/1721.1/963002015年03月30日T00:00:00ZStaged Program Repair in SPR
https://hdl.handle.net/1721.1/95970
Staged Program Repair in SPR
Long, Fan; Rinard, Martin
We present SPR, a new program repair system that uses condition synthesis to instantiate transformation schemas to repair program defects. SPR's staged repair strategy combines a rich space of potential repairs with a targeted search algorithm that makes this space viably searchable in practice. This strategy enables SPR to successfully find correct program repairs within a space that contains many meaningful and useful patches. The majority of these correct repairs are not within the search spaces of previous automatic program repair systems.
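Condition synthesis can be caricatured in a few lines (a toy of the idea only; SPR's schemas and staged search are far richer): record the values of in-scope variables at a branch across test runs, then search simple predicates for one whose outcomes make every test behave as desired.

from itertools import product

# Hypothetical recorded states and the branch outcome each test requires.
runs = [({"x": 0, "y": 5}, True),
        ({"x": 3, "y": 2}, False),
        ({"x": -1, "y": 9}, True)]

def candidates(varnames, consts=range(-2, 3)):
    for v, c in product(varnames, consts):
        yield f"{v} < {c}", lambda s, v=v, c=c: s[v] < c
        yield f"{v} == {c}", lambda s, v=v, c=c: s[v] == c

for text, pred in candidates(["x", "y"]):
    if all(pred(state) == want for state, want in runs):
        print("synthesized condition:", text)   # x < 1
        break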
2015年3月11日 00:00:00 GMThttps://hdl.handle.net/1721.1/959702015年03月11日T00:00:00ZStaged Program Repair in SPR (Supplementary Material)
https://hdl.handle.net/1721.1/95963
Staged Program Repair in SPR (Supplementary Material)
Long, Fan; Rinard, Martin
We present SPR, a new program repair system that uses condition synthesis to instantiate transformation schemas to repair program defects. SPR's staged repair strategy combines a rich space of potential repairs with a targeted search algorithm that makes this space viably searchable in practice. This strategy enables SPR to successfully find correct program repairs within a space that contains many correct patches. The majority of these correct patches are not within the search spaces of previous automatic program repair systems.
2015年3月05日 00:00:00 GMThttps://hdl.handle.net/1721.1/959632015年03月05日T00:00:00ZConsensus using Asynchronous Failure Detectors
https://hdl.handle.net/1721.1/95775
Consensus using Asynchronous Failure Detectors
Lynch, Nancy; Sastry, Srikanth
The FLP result shows that crash-tolerant consensus is impossible to solve in asynchronous systems, and several solutions have been proposed for crash-tolerant consensus under alternative (stronger) models. One popular approach is to augment the asynchronous system with appropriate failure detectors, which provide (potentially unreliable) information about process crashes in the system, to circumvent the FLP impossibility. In this paper, we demonstrate the exact mechanism by which (sufficiently powerful) asynchronous failure detectors enable solving crash-tolerant consensus. Our approach, which borrows arguments from the FLP impossibility proof and the famous result from CHT, which shows that Omega is a weakest failure detector to solve consensus, also yields a natural proof to Omega as a weakest asynchronous failure detector to solve consensus. The use of I/O automata theory in our approach enables us to model execution in a more detailed fashion than CHT and also addresses the latent assumptions and assertions in the original result in CHT.
2015年3月02日 00:00:00 GMThttps://hdl.handle.net/1721.1/957752015年03月02日T00:00:00ZOn the Formal Semantics of the Cognitive Middleware AWDRAT
https://hdl.handle.net/1721.1/95774
On the Formal Semantics of the Cognitive Middleware AWDRAT
Khan, Muhammad Taimoor; Serpanos, Dimitrios; Shrobe, Howard
The purpose of this work is twofold: on one hand, we want to formalize the behavior of critical components of the self-generating and adapting cognitive middleware AWDRAT, such that the formalism not only helps to understand the semantics and technical details of the middleware but also opens an opportunity to extend the middleware to support other complex application domains of cybersecurity; on the other hand, the formalism serves as a prerequisite for our proof of the behavioral correctness of the critical components, to ensure the safety of the middleware itself. However, here we focus only on the core and critical component of the middleware, i.e., the Execution Monitor, which is part of the module "Architectural Differencer" of AWDRAT. The role of the execution monitor is to identify inconsistencies between run-time observations of the target system and predictions of the System Architectural Model. To achieve this goal, we first define the formal (denotational) semantics of the observations (run-time events) and predictions (executable specifications as of the System Architectural Model); then, based on the aforementioned formal semantics, we formalize the behavior of the "Execution Monitor" of the middleware.
2015年3月03日 00:00:00 GMThttps://hdl.handle.net/1721.1/957742015年03月03日T00:00:00ZSpectral Alignment of Networks
https://hdl.handle.net/1721.1/94606
Spectral Alignment of Networks
Feizi, Soheil; Quon, Gerald; Medard, Muriel; Kellis, Manolis; Jadbabaie, Ali
Network alignment refers to the problem of finding a bijective mapping across vertices of two or more graphs to maximize the number of overlapping edges and/or to minimize the number of mismatched interactions across networks. This paper introduces a network alignment algorithm inspired by eigenvector analysis, which creates a simple relaxation for the underlying quadratic assignment problem. Our method relaxes binary assignment constraints along the leading eigenvector of an alignment matrix which captures the structure of matched and mismatched interactions across networks. Our proposed algorithm, denoted EigenAlign, has two steps. First, it computes the Perron-Frobenius eigenvector of the alignment matrix. Second, it uses this eigenvector in a linear optimization framework of maximum weight bipartite matching to infer bijective mappings across vertices of two graphs. Unlike existing network alignment methods, EigenAlign considers both matched and mismatched interactions in its optimization and is therefore effective in aligning networks even with low similarity. We show that, when certain technical conditions hold, the relaxation given by EigenAlign is asymptotically exact over Erdos-Renyi graphs with high probability. Moreover, for modular network structures, we show that EigenAlign can be used to split the large quadratic assignment optimization into small subproblems, enabling the use of computationally expensive, but tight, semidefinite relaxations over each subproblem. Through simulations, we show the effectiveness of the EigenAlign algorithm in aligning various network structures including Erdos-Renyi, power law, and stochastic block models, under different noise models. Finally, we apply EigenAlign to compare gene regulatory networks across human, fly and worm species, which we infer by integrating genome-wide functional and physical genomics datasets from the ENCODE and modENCODE consortia. EigenAlign infers conserved regulatory interactions across these species despite the large evolutionary distances spanned. We find strong conservation of centrally-connected genes and some biological pathways, especially for human-fly comparisons.
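The two steps are easy to sketch with dense linear algebra at toy scale (a simplified illustration: the real alignment matrix also scores mismatches and is handled at much larger scale):

import numpy as np
from scipy.optimize import linear_sum_assignment

def eigen_align(A1, A2, iters=100):
    n1, n2 = len(A1), len(A2)
    M = np.kron(A1, A2) + 1e-3        # alignment matrix over vertex pairs
    v = np.ones(n1 * n2)
    for _ in range(iters):            # power iteration -> Perron vector
        v = M @ v
        v /= np.linalg.norm(v)
    scores = v.reshape(n1, n2)        # score of mapping i -> i'
    rows, cols = linear_sum_assignment(-scores)   # max-weight matching
    return {int(r): int(c) for r, c in zip(rows, cols)}

A1 = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]])   # path a-b-c
A2 = np.array([[0, 1, 1], [1, 0, 0], [1, 0, 0]])   # star with hub 0
print(eigen_align(A1, A2))   # maps the path's center (1) to the hub (0)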
2015年2月18日 00:00:00 GMThttps://hdl.handle.net/1721.1/946062015年02月18日T00:00:00ZAutomatic Program Repair with Condition Synthesis and Compound Mutations
https://hdl.handle.net/1721.1/94520
Automatic Program Repair with Condition Synthesis and Compound Mutations
Long, Fan; Qi, Zichao; Achour, Sara; Rinard, Martin
We present PCR, a new automatic patch generation system. PCR uses a new condition synthesis technique to efficiently discover logical expressions that generate desired control-flow transfer patterns. Presented with a set of test cases, PCR deploys condition synthesis to find and repair incorrect if conditions that cause the application to produce the wrong result for one or more of the test cases. PCR also leverages condition synthesis to obtain a set of compound modifications that generate a rich, productive, and tractable search space of candidate patches. We evaluate PCR on a set of 105 defects from the GenProg benchmark set. For 40 of these defects, PCR generates plausible patches (patches that generate correct outputs for all inputs in the test suite used to validate the patch). For 12 of these defects, PCR generates correct patches that are functionally equivalent to developer patches that appear in subsequent versions. For comparison purposes, GenProg generates plausible patches for only 18 defects and correct patches for only 2 defects. AE generates plausible patches for only 27 defects and correct patches for only 3 defects.
2015年2月12日 00:00:00 GMThttps://hdl.handle.net/1721.1/945202015年02月12日T00:00:00ZAn Analysis of Patch Plausibility and Correctness for Generate-And-Validate Patch Generation Systems
https://hdl.handle.net/1721.1/94337
An Analysis of Patch Plausibility and Correctness for Generate-And-Validate Patch Generation Systems
Qi, Zichao; Long, Fan; Achour, Sara; Rinard, Martin
We analyze reported patches for three prior generate-and-validate patch generation systems (GenProg, RSRepair, and AE). Because of experimental error, the majority of the reported patches violate the basic principle behind the design of these systems -- they do not produce correct outputs even for the inputs in the test suite used to validate the patches. We also show that the overwhelming majority of the accepted patches are not correct and are equivalent to a single modification that simply deletes functionality. We also present Kali, a generate-and-validate patch generation system that simply deletes functionality. Working with a simpler and more effectively focused search space, Kali generates at least as many correct patches as prior GenProg, RSRepair, and AE systems. Kali also generates at least as many plausible patches that produce correct outputs for the inputs in the validation test suite as the three prior systems. We also discuss the patches produced by ClearView, a generate-and-validate binary hot patching system that leverages learned invariants to produce patches that enable systems to survive otherwise fatal defects and security attacks.
2015年2月10日 00:00:00 GMThttps://hdl.handle.net/1721.1/943372015年02月10日T00:00:00ZAn Analysis of Patch Plausibility and Correctness for Generate-And-Validate Patch Generation Systems (Supplementary Material)
https://hdl.handle.net/1721.1/93255
An Analysis of Patch Plausibility and Correctness for Generate-And-Validate Patch Generation Systems (Supplementary Material)
Qi, Zichao; Long, Fan; Achour, Sara; Rinard, Martin
We analyze reported patches for three prior generate-and-validate patch generation systems (GenProg, RSRepair, and AE). Because of experimental error, the majority of the reported patches violate the basic principle behind the design of these systems -- they do not produce correct outputs even for the inputs in the test suite used to validate the patches. We also show that the overwhelming majority of the accepted patches are not correct and are equivalent to a single modification that simply deletes functionality. We also present Kali, a generate-and-validate patch generation system that simply deletes functionality. Working with a simpler and more effectively focused search space, Kali produces more correct patches and at least as many patches that produce correct outputs for the inputs in the validation test suite as prior GenProg, RSRepair, and AE systems.
2015年2月02日 00:00:00 GMThttps://hdl.handle.net/1721.1/932552015年02月02日T00:00:00ZImproved Caching Strategies for Publish/Subscribe Internet Networking
https://hdl.handle.net/1721.1/93253
Improved Caching Strategies for Publish/Subscribe Internet Networking
Beckler, Kendra K.
The systemic structure of TCP/IP is outdated; a new scheme for data transportation is needed in order to make the internet more adaptive to modern demands of mobility, information-driven demand, an ever-increasing quantity of users and data, and performance requirements. While an information-centric networking system addresses these issues, one required component for publish/subscribe or content-addressed internet networking systems to work properly is an improved caching system. This allows publish/subscribe internet networking to dynamically route packets to mobile users, as an improvement over pure hierarchical or pure distributed caching systems. To this end, I proposed, implemented, and analyzed the workings of a superdomain caching system. The superdomain caching system is a hybrid of hierarchical and dynamic caching systems designed to continue reaping the benefits of the caching system for mobile users (who may move between neighboring domains in the midst of a network transaction) while minimizing the latency inherent in any distributed caching system, to improve upon the content-addressed system.
MEng thesis
2015年1月31日 00:00:00 GMThttps://hdl.handle.net/1721.1/932532015年01月31日T00:00:00ZEfficiently Solving Repeated Integer Linear Programming Problems by Learning Solutions of Similar Linear Programming Problems using Boosting Trees
https://hdl.handle.net/1721.1/93099
Efficiently Solving Repeated Integer Linear Programming Problems by Learning Solutions of Similar Linear Programming Problems using Boosting Trees
Banerjee, Ashis Gopal; Roy, Nicholas
It is challenging to obtain online solutions of large-scale integer linear programming (ILP) problems that occur frequently in slightly different forms during planning for autonomous systems. We refer to such ILP problems as repeated ILP problems. The branch-and-bound (BAB) algorithm is commonly used to solve ILP problems, and a significant amount of computation time is expended in solving numerous relaxed linear programming (LP) problems at the nodes of the BAB trees. We observe that the relaxed LP problems, both within a particular BAB tree and across multiple trees for repeated ILP problems, are similar to each other in the sense that they contain almost the same number of constraints, similar objective function and constraint coefficients, and an identical number of decision variables. We present a boosting tree-based regression technique for learning a set of functions that map the objective function and the constraints to the decision variables of such a system of similar LP problems; this enables us to efficiently infer approximately optimal solutions of the repeated ILP problems. We provide theoretical performance guarantees on the predicted values and demonstrate the effectiveness of the algorithm in four representative domains involving a library of benchmark ILP problems, aircraft carrier deck scheduling, vehicle routing, and vehicle control.
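The learning step can be sketched at toy scale (an assumed encoding with plain LPs rather than full ILPs; the paper supplies the real features and performance guarantees): fit boosted regression trees mapping problem coefficients to an optimal decision variable, then use predictions to warm-start similar instances.

import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from scipy.optimize import linprog

rng = np.random.default_rng(0)
X, y = [], []
for _ in range(200):
    c = rng.uniform(1, 2, size=2)     # perturbed objective coefficients
    b = rng.uniform(4, 6)             # perturbed constraint bound
    # minimize c . x  subject to  x0 + x1 >= b,  x >= 0
    res = linprog(c, A_ub=[[-1, -1]], b_ub=[-b], bounds=[(0, None)] * 2)
    X.append([*c, b]); y.append(res.x[0])

model = GradientBoostingRegressor().fit(np.array(X), np.array(y))
print(model.predict([[1.0, 1.5, 5.0]]))   # ~5: the cheaper x0 takes the load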
2015年1月21日 00:00:00 GMThttps://hdl.handle.net/1721.1/930992015年01月21日T00:00:00ZSupplementary Materials for "A Survey of Corpora in Computational and Cognitive Narrative Science"
https://hdl.handle.net/1721.1/92563
Supplementary Materials for "A Survey of Corpora in Computational and Cognitive Narrative Science"
Finlayson, Mark Alan
This archive contains supplementary materials for the article titled "A Survey of Corpora in Computational and Cognitive Narrative Science" by Mark A. Finlayson, published in the journal *Sprache und Datenverarbeitung*. The archive contains two files. The first file is the raw bibliographic data of the survey, containing 2600+ citations. The second file is a spreadsheet with the coded features of each corpus, plus the analyses that underlie sections 3 & 4 of the paper.
2014年12月30日 00:00:00 GMThttps://hdl.handle.net/1721.1/925632014年12月30日T00:00:00ZQueueing Theory Analysis of Labor & Delivery at a Tertiary Care Center
https://hdl.handle.net/1721.1/92354
Queueing Theory Analysis of Labor & Delivery at a Tertiary Care Center
Gombolay, Matthew; Golen, Toni; Shah, Neel; Shah, Julie
Labor and Delivery is a complex clinical service requiring the support of highly trained healthcare professionals from Obstetrics, Anesthesiology, and Neonatology and access to a finite set of valuable resources. In the United States, the rate of cesarean sections on labor floors is approximately twice as high as considered appropriate for patient care. We analyze one month of data from a Boston-area hospital to assess how well the labor and delivery process can be modelled with tools from queueing theory. We find that the labor and delivery process is highly amenable to analysis under queueing theory models. We also investigate the problem of high cesarean section rates and the potential effects on resource utilization of lowering the rate of cesarean sections.
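For intuition, delay statistics in such an analysis typically rest on standard M/M/c formulas; the sketch below computes the Erlang-C probability that an arriving patient waits for a free resource, using purely hypothetical rates rather than the paper's data.

    # Sketch: Erlang-C waiting probability for an M/M/c queue
    # (lam = arrival rate, mu = per-server service rate, c = servers).
    from math import factorial

    def erlang_c(lam, mu, c):
        a = lam / mu                    # offered load in Erlangs
        rho = a / c                     # utilization; requires rho < 1
        top = a**c / (factorial(c) * (1 - rho))
        return top / (sum(a**k / factorial(k) for k in range(c)) + top)

    # Hypothetical: 0.5 arrivals/hour, 12-hour mean occupancy, 10 beds.
    print(erlang_c(0.5, 1 / 12, 10))    # P(an arrival must wait)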
2014年12月16日 00:00:00 GMThttps://hdl.handle.net/1721.1/923542014年12月16日T00:00:00ZNetwork Infusion to Infer Information Sources in Networks
https://hdl.handle.net/1721.1/92031
Network Infusion to Infer Information Sources in Networks
Feizi, Soheil; Duffy, Ken; Kellis, Manolis; Medard, Muriel
Several models exist for diffusion of signals across biological, social, or engineered networks. However, the inverse problem of identifying the source of such propagated information appears more difficult even in the presence of multiple network snapshots, and especially for the single-snapshot case, given the many alternative, often similar, progressions of diffusion that may lead to the same observed snapshots. Mathematically, this problem can be approached using a diffusion kernel that represents diffusion processes in a given network, but computing this kernel is computationally challenging in general. Here, we propose a path-based network diffusion kernel which considers edge-disjoint shortest paths among pairs of nodes in the network and can be computed efficiently for both homogeneous and heterogeneous continuous-time diffusion models. We use this network diffusion kernel to solve the inverse diffusion problem, which we term Network Infusion (NI), using both likelihood maximization and error minimization. The minimum error NI algorithm is based on an asymmetric Hamming premetric function and can balance between false positive and false negative error types. We apply this framework for both single-source and multi-source diffusion, for both single-snapshot and multi-snapshot observations, and using both uninformative and informative prior probabilities for candidate source nodes. We also provide proofs that under a standard susceptible-infected diffusion model, (1) the maximum-likelihood NI is mean-field optimal for tree structures or sufficiently sparse Erdos-Renyi graphs, (2) the minimum-error algorithm is mean-field optimal for regular tree structures, and (3) for sufficiently distant sources, the multi-source solution is mean-field optimal in the regular tree structure. Moreover, we provide techniques to learn diffusion model parameters such as observation times. We apply NI to several synthetic networks and compare its performance to centrality-based and distance-based methods for Erdos-Renyi graphs, power-law networks, and symmetric and asymmetric grids. Moreover, we use NI in two real-world applications. First, we identify the news sources for 3,553 stories in the Digg social news network, and validate our results based on annotated information that was not provided to our algorithm. Second, we use NI to identify infusion hubs of human diseases, defined as gene candidates that can explain the connectivity pattern of disease-related genes in the human regulatory network. NI identifies infusion hubs of several human diseases including T1D, Parkinson, MS, SLE, Psoriasis and Schizophrenia. We show that the inferred infusion hubs are biologically relevant and often not identifiable using the raw p-values.
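A heavily simplified sketch of the source-inference step, assuming networkx is available: it scores each candidate source by how well shortest-path distances explain the infected set observed at time t. The paper's path-based kernel is richer (edge-disjoint paths, heterogeneous rates); this proxy is only to fix intuition.

    # Sketch: score candidate sources by how well shortest-path distances
    # explain the infected set observed at time t (toy SI-style proxy).
    import networkx as nx
    from math import exp

    def source_scores(G, infected, t, rate=1.0):
        scores = {}
        for s in G.nodes:
            d = nx.single_source_shortest_path_length(G, s)
            # Infected nodes should sit roughly rate*t hops from the source.
            scores[s] = sum(exp(-abs(d.get(v, float("inf")) - rate * t))
                            for v in infected)
        return max(scores, key=scores.get), scores

    G = nx.path_graph(7)
    print(source_scores(G, infected={2, 3, 4}, t=1)[0])   # -> 3, the center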
2014年12月02日 00:00:00 GMThttps://hdl.handle.net/1721.1/920312014年12月02日T00:00:00ZtBurton: A Divide and Conquer Temporal Planner
https://hdl.handle.net/1721.1/91170
tBurton: A Divide and Conquer Temporal Planner
Wang, David; Williams, Brian C.
Planning for and controlling a network of interacting devices requires a planner that accounts for the automatic timed transitions of devices while meeting deadlines and achieving durative goals. For example, a planner for an imaging satellite with a camera intolerant of exhaust would need to determine that opening a valve causes a chain reaction that ignites the engine, and thus needs to shield its camera. While planners exist that support deadlines and durative goals, currently, no planners can handle automatic timed transitions. We present tBurton, a temporal planner that supports these features while additionally producing a temporally least-commitment plan. tBurton uses a divide and conquer approach: dividing the problem using causal-graph decomposition and conquering each factor with heuristic forward search. The `sub-plans' from each factor are unified in a conflict directed search, guided by the causal graph structure. We describe why tBurton is fast and efficient and present its efficacy on benchmarks from the International Planning Competition.
2014年10月24日 00:00:00 GMThttps://hdl.handle.net/1721.1/911702014年10月24日T00:00:00ZAutomatic Error Elimination by Multi-Application Code Transfer
https://hdl.handle.net/1721.1/91150
Automatic Error Elimination by Multi-Application Code Transfer
Sidiroglou-Douskos, Stelios; Lahtinen, Eric; Rinard, Martin
We present pDNA, a system for automatically transferring correct code from donor applications into recipient applications to successfully eliminate errors in the recipient. Experimental results using six donor applications to eliminate nine errors in six recipient applications highlight the ability of pDNA to transfer code across applications to eliminate otherwise fatal integer and buffer overflow errors. Because pDNA works with binary donors with no need for source code or symbolic information, it supports a wide range of use cases. To the best of our knowledge, pDNA is the first system to eliminate software errors via the successful transfer of correct code across applications.
2014年10月02日 00:00:00 GMThttps://hdl.handle.net/1721.1/911502014年10月02日T00:00:00ZAutomatic Error Elimination by Multi-Application Code Transfer
https://hdl.handle.net/1721.1/91149
Automatic Error Elimination by Multi-Application Code Transfer
Sidiroglou-Douskos, Stelios; Lahtinen, Eric; Long, Fan; Piselli, Paolo; Rinard, Martin
We present pDNA, a system for automatically transferring correct code from donor applications into recipient applications to successfully eliminate errors in the recipient. Experimental results using six donor applications to eliminate nine errors in six recipient applications highlight the ability of pDNA to transfer code across applications to eliminate otherwise fatal integer and buffer overflow errors. Because pDNA works with binary donors with no need for source code or symbolic information, it supports a wide range of use cases. To the best of our knowledge, pDNA is the first system to eliminate software errors via the successful transfer of correct code across applications.
2014年9月30日 00:00:00 GMThttps://hdl.handle.net/1721.1/911492014年09月30日T00:00:00ZAutomatic Error Elimination by Multi-Application Code Transfer
https://hdl.handle.net/1721.1/91148
Automatic Error Elimination by Multi-Application Code Transfer
Sidiroglou-Douskos, Stelios; Lahtinen, Eric; Long, Fan; Piselli, Paolo; Rinard, Martin
We present pDNA, a system for automatically transferring correct code from donor applications into recipient applications to successfully eliminate errors in the recipient. Experimental results using three donor applications to eliminate seven errors in four recipient applications highlight the ability of pDNA to transfer code across applications to eliminate otherwise fatal integer overflow errors at critical memory allocation sites. Because pDNA works with binary donors with no need for source code or symbolic information, it supports a wide range of use cases. To the best of our knowledge, pDNA is the first system to eliminate software errors via the successful transfer of correct code across applications.
2014年8月11日 00:00:00 GMThttps://hdl.handle.net/1721.1/911482014年08月11日T00:00:00ZAn Analyst's Assistant for the Interpretation of Vehicle Track Data
https://hdl.handle.net/1721.1/90812
An Analyst's Assistant for the Interpretation of Vehicle Track Data
Borchardt, Gary; Katz, Boris; Nguyen, Hong-Linh; Felshin, Sue; Senne, Ken; Wang, Andy
This report describes the Analyst's Assistant, a software system for language-interactive, collaborative user-system interpretation of events, specifically targeting vehicle events that can be recognized on the basis of vehicle track data. The Analyst's Assistant utilizes language not only as a means of interaction, but also as a basis for internal representation of scene information, background knowledge, and results of interpretation. Building on this basis, the system demonstrates emerging intelligent systems techniques related to event recognition, summarization of events, partitioning of subtasks between user and system, and handling of language and graphical references to scene entities during interactive analysis.
2014年10月08日 00:00:00 GMThttps://hdl.handle.net/1721.1/908122014年10月08日T00:00:00ZAutomatic Error Elimination by Multi-Application Code Transfer
https://hdl.handle.net/1721.1/90561
Automatic Error Elimination by Multi-Application Code Transfer
Sidiroglou-Douskos, Stelios; Lahtinen, Eric; Rinard, Martin
We present Code Phage (CP), a system for automatically transferring correct code from donor applications into recipient applications to successfully eliminate errors in the recipient. Experimental results using six donor applications to eliminate nine errors in six recipient applications highlight the ability of CP to transfer code across applications to eliminate otherwise fatal integer and buffer overflow errors. Because CP works with binary donors with no need for source code or symbolic information, it supports a wide range of use cases. To the best of our knowledge, CP is the first system to eliminate software errors via the successful transfer of correct code across applications.
2014年10月02日 00:00:00 GMThttps://hdl.handle.net/1721.1/905612014年10月02日T00:00:00ZConstraint Generation for the Jeeves Privacy Language
https://hdl.handle.net/1721.1/90560
Constraint Generation for the Jeeves Privacy Language
Rose, Eva
Our goal is to present a completed, semantic formalization of the Jeeves privacy language evaluation engine, based on the original Jeeves constraint semantics defined by Yang et al. at POPL'12, but sufficiently strong to support a first complete implementation thereof. Specifically, we present and implement a syntactically and semantically completed concrete syntax for Jeeves that meets the example criteria given in the paper. We also present and implement the associated translation to J, here formulated as a completed, decompositional operational semantics. Finally, we present an enhanced and decompositional, non-substitutional operational semantic formulation and implementation of the J evaluation engine (the dynamic semantics) with privacy constraints. In particular, we show how implementing the constraints can be defined as a monad, and evaluation can be defined as a monadic operation on the constraint environment. The implementations are all completed in Haskell, utilizing its almost one-to-one capability to transparently reflect the underlying semantic reasoning when formalized this way. In practice, we have applied the "literate" program facility of Haskell to this report, a feature that enables the source LaTeX to also serve as the source code for the implementation (skipping the report parts as comment regions). The implementation is published as a GitHub project.
2014年10月01日 00:00:00 GMThttps://hdl.handle.net/1721.1/905602014年10月01日T00:00:00ZOpLog: a library for scaling update-heavy data structures
https://hdl.handle.net/1721.1/89653
OpLog: a library for scaling update-heavy data structures
Boyd-Wickizer, Silas; Kaashoek, M. Frans; Morris, Robert; Zeldovich, Nickolai
Existing techniques (e.g., RCU) can achieve good multi-core scaling for read-mostly data, but for update-heavy data structures only special-purpose techniques exist. This paper presents OpLog, a general-purpose library supporting good scalability for update-heavy data structures. OpLog achieves scalability by logging each update in a low-contention per-core log; it combines logs only when required by a read to the data structure. OpLog achieves generality by logging operations without having to understand them, to ease application to existing data structures. OpLog can further increase performance if the programmer indicates which operations can be combined in the logs. An evaluation shows how to apply OpLog to three update-heavy Linux kernel data structures. Measurements on a 48-core AMD server show that the result significantly improves the performance of the Apache web server and the Exim mail server under certain workloads.
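A minimal sketch of the core idea, with Python lists standing in for per-core logs and a global counter standing in for timestamps; a real implementation works in per-core memory with low-level synchronization, not Python objects.

    # Sketch: log updates per core; merge by timestamp only when a read occurs.
    import heapq, itertools

    class OpLog:
        def __init__(self, ncores):
            self.logs = [[] for _ in range(ncores)]  # low-contention per-core logs
            self.clock = itertools.count()           # stand-in for a timestamp source
            self.state = []

        def update(self, core, op):
            self.logs[core].append((next(self.clock), op))  # no shared state touched

        def read(self):
            # Merge all per-core logs in timestamp order, then apply once.
            for _, op in heapq.merge(*self.logs):
                op(self.state)
            for log in self.logs:
                log.clear()
            return self.state

    log = OpLog(4)
    log.update(0, lambda s: s.append("a"))
    log.update(2, lambda s: s.append("b"))
    print(log.read())   # ['a', 'b']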
2014年9月16日 00:00:00 GMThttps://hdl.handle.net/1721.1/896532014年09月16日T00:00:00ZAlloy*: A Higher-Order Relational Constraint Solver
https://hdl.handle.net/1721.1/89157
Alloy*: A Higher-Order Relational Constraint Solver
Milicevic, Aleksandar; Near, Joseph P.; Kang, Eunsuk; Jackson, Daniel
The last decade has seen a dramatic growth in the use of constraint solvers as a computational mechanism, not only for analysis and synthesis of software, but also at runtime. Solvers are available for a variety of logics but are generally restricted to first-order formulas. Some tasks, however, most notably those involving synthesis, are inherently higher order; these are typically handled by embedding a first-order solver (such as a SAT or SMT solver) in a domain-specific algorithm. Using strategies similar to those used in such algorithms, we show how to extend a first-order solver (in this case Kodkod, a model finder for relational logic used as the engine of the Alloy Analyzer) so that it can handle quantifications over higher-order structures. The resulting solver is sufficiently general that it can be applied to a range of problems; it is higher order, so that it can be applied directly, without embedding in another algorithm; and it performs well enough to be competitive with specialized tools on standard benchmarks. Although the approach is demonstrated for a particular relational logic, the principles behind it could be applied to other first-order solvers. Just as the identification of first-order solvers as reusable backends advanced the performance of specialized tools and simplified their architecture, factoring out higher-order solvers may bring similar benefits to a new class of tools.
2014年9月02日 00:00:00 GMThttps://hdl.handle.net/1721.1/891572014年09月02日T00:00:00ZMotion Compatibility for Indoor Localization
https://hdl.handle.net/1721.1/89075
Motion Compatibility for Indoor Localization
Park, Jun-geun; Teller, Seth
Indoor localization -- a device's ability to determine its location within an extended indoor environment -- is a fundamental enabling capability for mobile context-aware applications. Many proposed applications assume localization information from GPS, or from WiFi access points. However, GPS fails indoors and in urban canyons, and current WiFi-based methods require an expensive, and manually intensive, mapping, calibration, and configuration process performed by skilled technicians to bring the system online for end users. We describe a method that estimates indoor location with respect to a prior map consisting of a set of 2D floorplans linked through horizontal and vertical adjacencies. Our main contribution is the notion of "path compatibility," in which the sequential output of a classifier of inertial data producing low-level motion estimates (standing still, walking straight, going upstairs, turning left, etc.) is examined for agreement with the prior map. Path compatibility is encoded in an HMM-based matching model, from which the method recovers the user's location trajectory from the low-level motion estimates. To recognize user motions, we present a motion labeling algorithm, extracting fine-grained user motions from sensor data of handheld mobile devices. We propose "feature templates," which allow the motion classifier to learn the optimal window size for a specific combination of a motion and a sensor feature function. We show that, using only proprioceptive data of the quality typically available on a modern smartphone, our motion labeling algorithm classifies user motions with 94.5% accuracy, and our trajectory matching algorithm can recover the user's location to within 5 meters on average after one minute of movements from an unknown starting location. Prior information, such as a known starting floor, further decreases the time required to obtain a precise location estimate.
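A toy version of the matching step, assuming a hand-built floorplan adjacency graph and motion-emission probabilities: Viterbi decoding picks the location most compatible with the classified motions (only the final state is returned here; the paper's matching model is richer).

    # Toy HMM: Viterbi over floorplan adjacency, emissions over motion labels.
    def viterbi_last(states, adj, emit, prior, obs):
        V = {s: prior.get(s, 0.0) * emit[s][obs[0]] for s in states}
        for o in obs[1:]:
            V = {s: max(V[p] for p in states if s in adj[p]) * emit[s][o]
                 for s in states}
        return max(V, key=V.get)          # most compatible final location

    states = ["roomA", "hall", "roomB"]
    adj = {"roomA": {"roomA", "hall"},
           "hall": {"roomA", "hall", "roomB"},
           "roomB": {"hall", "roomB"}}
    emit = {"roomA": {"still": 0.7, "walk": 0.3},
            "hall": {"still": 0.1, "walk": 0.9},
            "roomB": {"still": 0.8, "walk": 0.2}}
    print(viterbi_last(states, adj, emit, {"roomA": 1.0},
                       ["walk", "walk", "still"]))   # -> 'roomB'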
2014年8月26日 00:00:00 GMThttps://hdl.handle.net/1721.1/890752014年08月26日T00:00:00ZEnergy-Efficient Approximate Computation in Topaz
https://hdl.handle.net/1721.1/88926
Energy-Efficient Approximate Computation in Topaz
Achour, Sara; Rinard, Martin
We present Topaz, a new task-based language for computations that execute on approximate computing platforms that may occasionally produce arbitrarily inaccurate results. The Topaz implementation maps approximate tasks onto the approximate machine and integrates the approximate results into the main computation, deploying a novel outlier detection and reliable reexecution mechanism to prevent unacceptably inaccurate results from corrupting the overall computation. Topaz therefore provides the developers of approximate hardware with substantial freedom in producing designs with little or no precision or accuracy guarantees. Experimental results from our set of benchmark applications demonstrate the effectiveness of Topaz and the Topaz implementation in enabling developers to productively exploit emerging approximate hardware platforms.
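A toy sketch of the execution model, assuming an emulated approximate unit that occasionally corrupts results; the range-based outlier detector and all names are illustrative, not Topaz's actual mechanism.

    # Sketch: run a task approximately; if the result looks like an outlier,
    # re-execute it reliably before merging into the main computation.
    import random

    def approx_exec(task, *args):
        r = task(*args)
        if random.random() < 0.05:          # emulate an occasional arbitrary error
            r = r * random.uniform(-100, 100)
        return r

    def run_task(task, args, lo, hi):
        r = approx_exec(task, *args)
        if not (lo <= r <= hi):             # outlier detector: expected-range check
            r = task(*args)                 # reliable re-execution
        return r

    # Hypothetical kernel with known output range [0, 1]:
    print(run_task(lambda x: x * x, (0.5,), 0.0, 1.0))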
2014年8月19日 00:00:00 GMThttps://hdl.handle.net/1721.1/889262014年08月19日T00:00:00ZA Coded Shared Atomic Memory Algorithm for Message Passing Architectures
https://hdl.handle.net/1721.1/88551
A Coded Shared Atomic Memory Algorithm for Message Passing Architectures
Cadambe, Viveck R.; Lynch, Nancy; Medard, Muriel; Musial, Peter
This paper considers the communication and storage costs of emulating atomic (linearizable) multi-writer multi-reader shared memory in distributed message-passing systems. The paper contains three main contributions: (1) We present an atomic shared-memory emulation algorithm that we call Coded Atomic Storage (CAS). This algorithm uses erasure coding methods. In a storage system with 'N' servers that is resilient to 'f' server failures, we show that the communication cost of CAS is N/(N-2f). The storage cost of CAS is unbounded. (2) We present a modification of the CAS algorithm known as CAS with Garbage Collection (CASGC). The CASGC algorithm is parametrized by an integer 'd' and has a bounded storage cost. We show that in every execution where the number of write operations that are concurrent with a read operation is no bigger than 'd', the CASGC algorithm with parameter 'd' satisfies atomicity and liveness. We explicitly characterize the storage cost of CASGC, and show that it has the same communication cost as CAS. (3) We describe an algorithm known as the Communication Cost Optimal Atomic Storage (CCOAS) algorithm that achieves a smaller communication cost than CAS and CASGC. In particular, CCOAS incurs read and write communication costs of N/(N-f) measured in terms of number of object values. We also discuss drawbacks of CCOAS as compared with CAS and CASGC.
2014年8月01日 00:00:00 GMThttps://hdl.handle.net/1721.1/885512014年08月01日T00:00:00ZAutotuning Algorithmic Choice for Input Sensitivity
https://hdl.handle.net/1721.1/88083
Autotuning Algorithmic Choice for Input Sensitivity
Ding, Yufei; Ansel, Jason; Veeramachaneni, Kalyan; Shen, Xipeng; O'Reilly, Una-May; Amarasinghe, Saman
Empirical autotuning is increasingly being used in many domains to achieve optimized performance in a variety of different execution environments. A daunting challenge faced by such autotuners is input sensitivity, where the best autotuned configuration may vary with different input sets. In this paper, we propose a two-level solution: first, it clusters to find input sets that are similar in input feature space; second, it uses an evolutionary autotuner to build an optimized program for each of these clusters; and finally, it builds an adaptive, overhead-aware classifier which assigns each input to a specific input-optimized program. Our approach addresses the complex trade-off between using expensive features, to accurately characterize an input, and cheaper features, which can be computed with less overhead. Experimental results show that by adapting to different inputs one can obtain up to a 3x speedup over using a single configuration for all inputs.
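A compact sketch of the pipeline using scikit-learn's KMeans; the toy tune function below merely records cluster size, standing in for the evolutionary autotuner.

    # Sketch: cluster inputs in feature space, tune one config per cluster,
    # then dispatch new inputs to the nearest cluster's tuned program.
    import numpy as np
    from sklearn.cluster import KMeans

    def build(features, tune):               # tune(cluster_points) -> config
        km = KMeans(n_clusters=3, n_init=10).fit(features)
        configs = [tune(features[km.labels_ == k]) for k in range(3)]
        return km, configs

    def dispatch(km, configs, x):
        return configs[km.predict(np.asarray(x).reshape(1, -1))[0]]

    X = np.array([[0.1, 0], [0.2, 0], [5, 5], [5.1, 5], [9, 0], [9.2, 0.1]])
    km, configs = build(X, tune=lambda pts: {"block": len(pts)})  # toy tuner
    print(dispatch(km, configs, [5.05, 5.0]))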
2014年6月23日 00:00:00 GMThttps://hdl.handle.net/1721.1/880832014年06月23日T00:00:00ZPossibilistic Beliefs and Higher-Level Rationality
https://hdl.handle.net/1721.1/87727
Possibilistic Beliefs and Higher-Level Rationality
Chen, Jing; Micali, Silvio; Pass, Rafael
We consider rationality and rationalizability for normal-form games of incomplete information in which the players have possibilistic beliefs about their opponents. In this setting, we prove that the strategies compatible with the players being level-k rational coincide with the strategies surviving a natural k-step iterated elimination procedure. We view the latter strategies as the (level-k) rationalizable ones in our possibilistic setting.
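The k-step procedure can be made concrete. Below is a sketch for finite two-player normal-form games that runs k rounds of eliminating strictly dominated pure strategies; this is a simplification of the paper's possibilistic elimination, shown only to fix intuition.

    # Sketch: k rounds of eliminating pure strategies strictly dominated
    # against the opponent's surviving strategies.
    def eliminate(U1, U2, k):
        # U1[i][j]: row player's payoff; U2[i][j]: column player's payoff.
        S1, S2 = set(range(len(U1))), set(range(len(U1[0])))
        for _ in range(k):
            S1 = {i for i in S1 if not any(
                all(U1[i2][j] > U1[i][j] for j in S2) for i2 in S1)}
            S2 = {j for j in S2 if not any(
                all(U2[i][j2] > U2[i][j] for i in S1) for j2 in S2)}
        return S1, S2

    # Prisoner's dilemma: defect (index 1) survives for both players.
    print(eliminate([[3, 0], [5, 1]], [[3, 5], [0, 1]], 2))   # ({1}, {1})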
2014年6月09日 00:00:00 GMThttps://hdl.handle.net/1721.1/877272014年06月09日T00:00:00ZPossibilistic Beliefs and Higher-Level Rationality
https://hdl.handle.net/1721.1/87710
Possibilistic Beliefs and Higher-Level Rationality
Chen, Jing; Micali, Silvio; Pass, Rafael
We consider rationality and rationalizability for normal-form games of incomplete information in which the players have possibilistic beliefs about their opponents. In this setting, we prove that the strategies compatible with the players being level-k rational coincide with the strategies surviving a natural k-step iterated elimination procedure. We view the latter strategies as the (level-k) rationalizable ones in our possibilistic setting.
2014年6月09日 00:00:00 GMThttps://hdl.handle.net/1721.1/877102014年06月09日T00:00:00ZLatent Case Model: A Generative Approach for Case-Based Reasoning and Prototype Classification
https://hdl.handle.net/1721.1/87548
Latent Case Model: A Generative Approach for Case-Based Reasoning and Prototype Classification
Kim, Been; Rudin, Cynthia; Shah, Julie
We present a general framework for Bayesian case-based reasoning and prototype classification and clustering -- Latent Case Model (LCM). LCM learns the most representative prototype observations of a dataset by performing joint inference on cluster prototypes and features. Simultaneously, LCM pursues sparsity by learning subspaces, the sets of few features that play important roles in characterizing the prototypes. The prototype and subspace representation preserves interpretability in high-dimensional data. We validate that the approach preserves classification accuracy on standard data sets, and verify through human subject experiments that the output of LCM produces statistically significant improvements in participants' performance on a task requiring an understanding of clusters within a dataset.
2014年5月26日 00:00:00 GMThttps://hdl.handle.net/1721.1/875482014年05月26日T00:00:00ZQuaternionic Representation of the Riesz Pyramid for Video Magnification
https://hdl.handle.net/1721.1/86300
Quaternionic Representation of the Riesz Pyramid for Video Magnification
Wadhwa, Neal; Rubinstein, Michael; Durand, Fredo; Freeman, William T.
Recently, we presented a new image pyramid, called the Riesz pyramid, that uses the Riesz transform to manipulate the phase in non-oriented sub-bands of an image sequence to produce real-time motion-magnified videos. In this report we give a quaternionic formulation of the Riesz pyramid, and show how several seemingly heuristic choices in how to use the Riesz transform for phase-based video magnification fall out of this formulation in a natural and principled way. We intend this report to accompany the original paper on the Riesz pyramid for video magnification.
2014年4月26日 00:00:00 GMThttps://hdl.handle.net/1721.1/863002014年04月26日T00:00:00ZMulti-Person Motion Tracking via RF Body Reflections
https://hdl.handle.net/1721.1/86299
Multi-Person Motion Tracking via RF Body Reflections
Adib, Fadel; Kabelac, Zachary; Katabi, Dina
Recently, we have witnessed the emergence of technologies that can localize a user and track her gestures based purely on radio reflections off the person's body. These technologies work even if the user is behind a wall or obstruction. However, for these technologies to be fully practical, they need to address major challenges such as scaling to multiple people, accurately localizing them and tracking their gestures, and localizing static users as opposed to requiring the user to move to be detectable. This paper presents WiZ, the first multi-person centimeter-scale motion tracking system that pinpoints people's locations based purely on RF reflections off their bodies. WiZ can also locate static users by sensing minute changes in their RF reflections due to breathing. Further, it can track concurrent gestures made by different individuals, even when they carry no wireless device on them. We implement a prototype of WiZ and show that it can localize up to five users each with a median accuracy of 8-18 cm and 7-11 cm in the x and y dimensions, respectively. WiZ can also detect 3D pointing gestures of multiple users with a median orientation error of 8-16 degrees for each of them. Finally, WiZ can track breathing motion and output the breath count of multiple people with high accuracy.
2014年4月26日 00:00:00 GMThttps://hdl.handle.net/1721.1/862992014年04月26日T00:00:00ZOne Clock to Rule Them All: A Primitive for Distributed Wireless Protocols at the Physical Layer
https://hdl.handle.net/1721.1/86298
One Clock to Rule Them All: A Primitive for Distributed Wireless Protocols at the Physical Layer
Abari, Omid; Rahul, Hariharan; Katabi, Dina
Implementing distributed wireless protocols at the physical layer today is challenging because different nodes have different clocks, each of which has slightly different frequencies. This causes the nodes to have frequency offset relative to each other, as a result of which transmitted signals from these nodes do not combine in a predictable manner over time. Past work tackles this challenge and builds distributed PHY layer systems by attempting to address the effects of the frequency offset and compensating for it in the transmitted signals. In this paper, we address this challenge by addressing the root cause - the different clocks with different frequencies on the different nodes. We present AirClock, a new wireless coordination primitive that enables multiple nodes to act as if they are driven by a single clock that they receive wirelessly over the air. AirClock presents a synchronized abstraction to the physical layer, and hence enables direct implementation of diverse kinds of distributed PHY protocols. We illustrate AirClock's versatility by using it to build three different systems: distributed MIMO, distributed rate adaptation for wireless sensors, and pilotless OFDM, and show that they can provide significant performance benefits over today's systems.
2014年4月27日 00:00:00 GMThttps://hdl.handle.net/1721.1/862982014年04月27日T00:00:00ZSymbolic Execution for (Almost) Free: Hijacking an Existing Implementation to Perform Symbolic Execution
https://hdl.handle.net/1721.1/86235
Symbolic Execution for (Almost) Free: Hijacking an Existing Implementation to Perform Symbolic Execution
Near, Joseph P.; Jackson, Daniel
Symbolic execution of a language is traditionally achieved by replacing the language's interpreter with an entirely new interpreter. This may be an unnecessary burden, and it is tempting instead to try to use as much of the existing interpreter infrastructure as possible, both for handling aspects of the computation that are not symbolic, and for propagating symbolic ones. This approach was used to implement Rubicon, a bounded verification system for Ruby on Rails web applications, in less than 1000 lines of Ruby code. Rubicon uses symbolic execution to derive verification conditions from Rails applications and an off-the-shelf solver to check them. Despite its small size, Rubicon has been used to find previously unknown bugs in open-source Rails applications. The key idea is to encode symbolic values and operations in a library written in the target language itself, overriding only a small part of the standard interpreter. We formalize this approach, showing that replacing a few key operators with symbolic versions in a standard interpreter gives the same effect as replacing the entire interpreter with a symbolic one.
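The same trick is easy to demonstrate in any language with operator overloading; here is a Python rendition (Rubicon itself is written in Ruby) in which unmodified application code runs symbolically the moment it is handed a symbolic value.

    # Sketch: encode symbolic values in the target language itself, overriding
    # a few operators so the standard interpreter propagates them.
    class Sym:
        def __init__(self, expr): self.expr = expr
        def __add__(self, other):
            return Sym(("+", self.expr, getattr(other, "expr", other)))
        def __mul__(self, other):
            return Sym(("*", self.expr, getattr(other, "expr", other)))
        def __repr__(self): return f"Sym({self.expr!r})"

    def price(qty, unit):          # unmodified "application" code
        return qty * unit + 5      # concrete parts run on the normal interpreter

    print(price(3, 2))             # 11 -- ordinary execution
    print(price(Sym("q"), 2))      # Sym(('+', ('*', 'q', 2), 5)) -- symbolic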
2014年4月22日 00:00:00 GMThttps://hdl.handle.net/1721.1/862352014年04月22日T00:00:00ZMoebius Language Reference, Version 1.2
https://hdl.handle.net/1721.1/86174
Moebius Language Reference, Version 1.2
Borchardt, Gary C.
Moebius is a representation and interface language based on a subset of English. It is designed for use as a means of encoding information and as a means of conveying information between software components and other software components, between software components and humans, and between data repositories and their users -- human or machine. This report describes the structure and use of the Moebius language and presents three applications of the language to date.
2014年4月09日 00:00:00 GMThttps://hdl.handle.net/1721.1/861742014年04月09日T00:00:00ZSloth: Being Lazy is a Virtue (When Issuing Database Queries)
https://hdl.handle.net/1721.1/86173
Sloth: Being Lazy is a Virtue (When Issuing Database Queries)
Cheung, Alvin; Madden, Samuel; Solar-Lezama, Armando
Many web applications store persistent data in databases. During execution, such applications spend a significant amount of time communicating with the database for retrieval and storing of persistent data over the network. These network round trips represent a significant fraction of the overall execution time for many applications and as a result increase their latency. While there has been prior work that aims to eliminate round trips by batching queries, they are limited by 1) a requirement that developers manually identify batching opportunities, or 2) the fact that they employ static program analysis techniques that cannot exploit many opportunities for batching. In this paper, we present Sloth, a new system that extends traditional lazy evaluation to expose query batching opportunities during application execution, even across loops, branches, and method boundaries. We evaluated Sloth using over 100 benchmarks from two large-scale open-source applications, and achieved up to a 3x reduction in page load time by delaying computation.
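A minimal sketch of the lazy-batching idea in a Python-like setting; run_batch is a hypothetical stand-in for a batched database round trip, and Sloth's actual mechanism operates on Java/ORM query calls rather than objects like these.

    # Sketch: defer queries behind thunks and flush them in one batch when a
    # value is actually demanded.
    class Lazy:
        pending = []                      # queries deferred so far
        def __init__(self, sql):
            self.sql, self.result = sql, None
            Lazy.pending.append(self)
        def force(self):
            if self.result is None:       # first demand: one batched round trip
                batch, Lazy.pending = Lazy.pending, []
                results = run_batch([q.sql for q in batch])
                for q, r in zip(batch, results):
                    q.result = r
            return self.result

    def run_batch(sqls):                  # stand-in for a real batched round trip
        return [f"rows for: {s}" for s in sqls]

    a = Lazy("SELECT * FROM users")
    b = Lazy("SELECT * FROM orders")
    print(b.force())                      # both queries reach the DB in one batch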
2014年4月14日 00:00:00 GMThttps://hdl.handle.net/1721.1/861732014年04月14日T00:00:00ZCicada: Predictive Guarantees for Cloud Network Bandwidth
https://hdl.handle.net/1721.1/85975
Cicada: Predictive Guarantees for Cloud Network Bandwidth
LaCurts, Katrina; Mogul, Jeffrey C.; Balakrishnan, Hari; Turner, Yoshio
In cloud-computing systems, network-bandwidth guarantees have been shown to improve predictability of application performance and cost. Most previous work on cloud-bandwidth guarantees has assumed that cloud tenants know what bandwidth guarantees they want. However, application bandwidth demands can be complex and time-varying, and many tenants might lack sufficient information to request a bandwidth guarantee that is well-matched to their needs. A tenant's lack of accurate knowledge about its future bandwidth demands can lead to over-provisioning (and thus reduced cost-efficiency) or under-provisioning (and thus poor user experience in latency-sensitive user-facing applications). We analyze traffic traces gathered over six months from an HP Cloud Services datacenter, finding that application bandwidth consumption is both time-varying and spatially inhomogeneous. This variability makes it hard to predict requirements. To solve this problem, we develop a prediction algorithm usable by a cloud provider to suggest an appropriate bandwidth guarantee to a tenant. The key idea in the prediction algorithm is to treat a set of previously observed traffic matrices as "experts" and learn online the best weighted linear combination of these experts to make its prediction. With tenant VM placement using these predictive guarantees, we find that the inter-rack network utilization in certain datacenter topologies can be more than doubled.
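The core of the predictor can be sketched as follows, assuming numpy: past traffic matrices act as experts and a multiplicative-weights step (one update shown) learns the combination; the paper's loss function and update details may differ.

    # Sketch: past traffic matrices as "experts"; one online reweighting step.
    import numpy as np

    def update_weights(experts, actual, w, eta=0.5):
        losses = np.array([np.abs(e - actual).mean() for e in experts])
        w = w * np.exp(-eta * losses)      # downweight badly predicting experts
        return w / w.sum()

    experts = [np.array([[0, 1.0], [1.0, 0]]), np.array([[0, 2.0], [2.0, 0]])]
    w = np.ones(len(experts)) / len(experts)
    w = update_weights(experts, actual=np.array([[0, 1.9], [1.9, 0]]), w=w)
    prediction = sum(wi * e for wi, e in zip(w, experts))  # guarantee suggestion
    print(w, prediction, sep="\n")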
2014年3月24日 00:00:00 GMThttps://hdl.handle.net/1721.1/859752014年03月24日T00:00:00ZThe N2 Corpus v1.0
https://hdl.handle.net/1721.1/85893
The N2 Corpus v1.0
Finlayson, Mark A.; Halverson, Jeffry R.; Corman, Steven R.
The N2 Corpus (Narrative Networks Corpus) comprises 100 story texts (42,480 words) relevant to Islamist Extremism, drawn from religious stories, online material, and promotional magazines. The corpus has been annotated for 14 different layers of syntax and semantics. This v1.0 version is missing 33 texts that will be added in later versions. The corpus is described in: Mark A. Finlayson, Jeffry R. Halverson, and Steven R. Corman (2014) "The N2 Corpus: A semantically annotated collection of Islamist extremist stories", Proceedings of the 9th Language Resources and Evaluation Conference (LREC), Reykjavik, Iceland.
2014年3月22日 00:00:00 GMThttps://hdl.handle.net/1721.1/858932014年03月22日T00:00:00ZAn Architecture for Online Affordance-based Perception and Whole-body Planning
https://hdl.handle.net/1721.1/85690
An Architecture for Online Affordance-based Perception and Whole-body Planning
Fallon, Maurice; Kuindersma, Scott; Karumanchi, Sisir; Antone, Matthew; Schneider, Toby; Dai, Hongkai; Perez D'Arpino, Claudia; Deits, Robin; DiCicco, Matt; Fourie, Dehann; Koolen, Twan; Marion, Pat; Posa, Michael; Valenzuela, Andres; Yu, Kuan-Ting; Shah, Julie; Iagnemma, Karl; Tedrake, Russ; Teller, Seth
The DARPA Robotics Challenge Trials held in December 2013 provided a landmark demonstration of dexterous mobile robots executing a variety of tasks aided by a remote human operator using only data from the robot's sensor suite transmitted over a constrained, field-realistic communications link. We describe the design considerations, architecture, implementation and performance of the software that Team MIT developed to command and control an Atlas humanoid robot. Our design emphasized human interaction with an efficient motion planner, where operators expressed desired robot actions in terms of affordances fit using perception and manipulated in a custom user interface. We highlight several important lessons we learned while developing our system on a highly compressed schedule.
2014年3月16日 00:00:00 GMThttps://hdl.handle.net/1721.1/856902014年03月16日T00:00:00ZPIKA: A Network Service for Multikernel Operating Systems
https://hdl.handle.net/1721.1/84608
PIKA: A Network Service for Multikernel Operating Systems
Beckmann, Nathan Z.; Gruenwald III, Charles; Johnson, Christopher R.; Kasture, Harshad; Sironi, Filippo; Agarwal, Anant; Kaashoek, M. Frans; Zeldovich, Nickolai
PIKA is a network stack designed for multikernel operating systems that target potential future architectures lacking cache-coherent shared memory but supporting message passing. PIKA splits the network stack into several servers that communicate using a low-overhead message passing layer. A key challenge faced by PIKA is the maintenance of shared state, such as a single accept queue and load balance information. PIKA addresses this challenge using a speculative 3-way handshake for connection acceptance, and a new distributed load balancing scheme for spreading connections. A PIKA prototype achieves competitive performance, excellent scalability, and low service times under load imbalance on commodity hardware. Finally, we demonstrate that splitting network stack processing by function across separate cores is a net loss on commodity hardware, and we describe conditions under which it may be advantageous.
2014年1月28日 00:00:00 GMThttps://hdl.handle.net/1721.1/846082014年01月28日T00:00:00ZReliability-Aware Optimization of Approximate Computational Kernels with Rely
https://hdl.handle.net/1721.1/83843
Reliability-Aware Optimization of Approximate Computational Kernels with Rely
Misailovic, Sasa; Carbin, Michael; Achour, Sara; Qi, Zichao; Rinard, Martin
Emerging high-performance architectures are anticipated to contain unreliable components (e.g., ALUs) that offer low power consumption at the expense of soft errors. Some applications (such as multimedia processing, machine learning, and big data analytics) can often naturally tolerate soft errors and can therefore trade accuracy of their results for reduced energy consumption by utilizing these unreliable hardware components. We present and evaluate a technique for reliability-aware optimization of approximate computational kernel implementations. Our technique takes a standard implementation of a computation and automatically replaces some of its arithmetic operations with unreliable versions that consume less power, but may produce incorrect results with some probability. Our technique works with a developer-provided specification of the required reliability of a computation -- the probability that it returns the correct result -- and produces an unreliable implementation that satisfies that specification. We evaluate our approach on five applications from the image processing, numerical analysis, and financial analysis domains and demonstrate how our technique enables automatic exploration of the trade-off between the reliability of a computation and its performance.
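Under an independent-faults assumption, the reliability of a kernel is the product of its operations' reliabilities, which is the shape of check such a specification enables; the operation counts and probabilities below are invented for illustration.

    # Sketch: does a mix of reliable/unreliable operations meet the spec?
    def kernel_reliability(op_reliabilities):
        r = 1.0
        for p in op_reliabilities:       # P(all ops correct) = product of per-op P
            r *= p
        return r

    ops = [0.99999] * 200 + [0.9999] * 50   # hypothetical mix of operations
    spec = 0.99                              # required: correct with prob >= 0.99
    print(kernel_reliability(ops) >= spec)   # True here: product ~0.993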
2014年1月09日 00:00:00 GMThttps://hdl.handle.net/1721.1/838432014年01月09日T00:00:00ZSynthesis of Randomized Accuracy-Aware Map-Fold Programs
https://hdl.handle.net/1721.1/83397
Synthesis of Randomized Accuracy-Aware Map-Fold Programs
Misailovic, Sasa; Rinard, Martin
We present Syndy, a technique for automatically synthesizing randomized map/fold computations that trade accuracy for performance. Given a specification of a fully accurate computation, Syndy automatically synthesizes approximate implementations of map and fold tasks, explores the approximate computation space that these approximations induce, and derives an accuracy versus performance tradeoff curve that characterizes the explored space. Each point on the curve corresponds to an approximate randomized program configuration that realizes the probabilistic error and time bounds associated with that point.
2013年12月29日 00:00:00 GMThttps://hdl.handle.net/1721.1/833972013年12月29日T00:00:00Z3D Tracking via Body Radio Reflections
https://hdl.handle.net/1721.1/82913
3D Tracking via Body Radio Reflections
Adib, Fadel; Kabelac, Zach; Katabi, Dina; Miller, Robert C.
This paper introduces WiTrack, a system that tracks the 3D motion of a user from the radio signals reflected off her body. It works even if the person is occluded from the WiTrack device or in a different room. WiTrack does not require the user to carry any wireless device, yet its accuracy exceeds that of current RF localization systems, which require the user to hold a transceiver. Empirical measurements with a WiTrack prototype show that, on average, it localizes the center of a human body to within 10 to 13 cm in the x and y dimensions, and 21 cm in the z dimension. It also provides coarse tracking of body parts, identifying the direction of a pointing hand with a median of 11.2 degrees. WiTrack bridges a gap between RF-based localization systems, which locate a user through walls and occlusions, and human-computer interaction systems like Kinect, which can track a user without instrumenting her body, but require the user to stay within the direct line of sight of the device.
2013年12月11日 00:00:00 GMThttps://hdl.handle.net/1721.1/829132013年12月11日T00:00:00ZBridging Utility Maximization and Regret Minimization
https://hdl.handle.net/1721.1/82632
Bridging Utility Maximization and Regret Minimization
Chiesa, Alessandro; Micali, Silvio; Zhu, Zeyuan Allen
We relate the strategies obtained by (1) utility maximizers who use regret to refine their set of undominated strategies, and (2) regret minimizers who use weak domination to refine their sets of regret-minimizing strategies.
2013年12月03日 00:00:00 GMThttps://hdl.handle.net/1721.1/826322013年12月03日T00:00:00ZGenBase: A Complex Analytics Genomics Benchmark
https://hdl.handle.net/1721.1/82517
GenBase: A Complex Analytics Genomics Benchmark
Taft, Rebecca; Vartak, Manasi; Satish, Nadathur Rajagopalan; Sundaram, Narayanan; Madden, Samuel; Stonebraker, Michael
This paper introduces a new benchmark, designed to test database management system (DBMS) performance on a mix of data management tasks (joins, filters, etc.) and complex analytics (regression, singular value decomposition, etc.). Such mixed workloads are prevalent in a number of application areas, including most science workloads and web analytics. As a specific use case, we have chosen genomics data for our benchmark, and have constructed a collection of typical tasks in this area. In addition to being representative of a mixed data management and analytics workload, this benchmark is also meant to scale to large dataset sizes and multiple nodes across a cluster. Besides presenting this benchmark, we have run it on a variety of storage systems including traditional row stores, newer column stores, Hadoop, and an array DBMS. We present performance numbers on all systems on single and multiple nodes, and show that performance differs by orders of magnitude between the various solutions. In addition, we demonstrate that most platforms have scalability issues. We also test offloading the analytics onto a coprocessor. The intent of this benchmark is to focus research interest in this area; to this end, all of our data, data generators, and scripts are available on our web site.
2013年11月19日 00:00:00 GMThttps://hdl.handle.net/1721.1/825172013年11月19日T00:00:00ZOn Randomized Path Coverage of Configuration Spaces
https://hdl.handle.net/1721.1/82462
On Randomized Path Coverage of Configuration Spaces
Perez, Alejandro
We present a sampling-based algorithm that generates a set of locally-optimal paths that differ in visibility.
2013年11月18日 00:00:00 GMThttps://hdl.handle.net/1721.1/824622013年11月18日T00:00:00ZOpenTuner: An Extensible Framework for Program Autotuning
https://hdl.handle.net/1721.1/81958
OpenTuner: An Extensible Framework for Program Autotuning
Ansel, Jason; Kamil, Shoaib; Veeramachaneni, Kalyan; O'Reilly, Una-May; Amarasinghe, Saman
Program autotuning has been shown to achieve better or more portable performance in a number of domains. However, autotuners themselves are rarely portable between projects, for a number of reasons: using a domain-informed search space representation is critical to achieving good results; search spaces can be intractably large and require advanced machine learning techniques; and the landscape of search spaces can vary greatly between different problems, sometimes requiring domain specific search techniques to explore efficiently. This paper introduces OpenTuner, a new open source framework for building domain-specific multi-objective program autotuners. OpenTuner supports fully-customizable configuration representations, an extensible technique representation to allow for domain-specific techniques, and an easy to use interface for communicating with the program to be autotuned. A key capability inside OpenTuner is the use of ensembles of disparate search techniques simultaneously; techniques that perform well will dynamically be allocated a larger proportion of tests. We demonstrate the efficacy and generality of OpenTuner by building autotuners for 6 distinct projects and 14 total benchmarks, showing speedups over prior techniques of these projects of up to 2.8x with little programmer effort.
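The ensemble allocation can be pictured as a bandit problem; the sketch below uses a simple epsilon-greedy policy as a stand-in, since OpenTuner's actual credit-assignment policy differs, and the toy "techniques" here just propose numbers.

    # Sketch: allocate tuning trials across an ensemble of search techniques,
    # giving techniques that find improvements a larger share.
    import random

    def tune(techniques, evaluate, budget, eps=0.2):
        score = {t: 1.0 for t in techniques}
        best = float("inf")
        for _ in range(budget):
            if random.random() < eps:
                t = random.choice(techniques)          # explore
            else:
                t = max(techniques, key=score.get)     # exploit best technique
            cost = evaluate(t())                        # technique proposes a config
            if cost < best:
                best, score[t] = cost, score[t] + 1.0  # credit the technique
        return best

    techniques = [lambda: random.uniform(0, 10), lambda: random.gauss(5, 1)]
    print(tune(techniques, evaluate=lambda x: (x - 3) ** 2, budget=100))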
2013年11月01日 00:00:00 GMThttps://hdl.handle.net/1721.1/819582013年11月01日T00:00:00ZCode for Java Libraries for Accessing the Princeton Wordnet: Comparison and Evaluation
https://hdl.handle.net/1721.1/81949
Code for Java Libraries for Accessing the Princeton Wordnet: Comparison and Evaluation
Finlayson, Mark Alan
This archive contains the code and data for running the evaluations described in: Finlayson, Mark Alan (2014) "Java Libraries for Accessing the Princeton Wordnet: comparison and Evaluation" in Proceedings of the 7th Global Wordnet Conference (GWC 2014). Tartu, Estonia, 25-29 January 2014. The archive contains five Eclipse projects (compatible with Eclipse 3.8.0) that may be imported directly into an Eclipse workspace. You will need a Java 1.4, 1.5, and 1.6 JRE to run all the code in the archive. Paper abstract: Java is a popular programming language for natural language processing. I compare and evaluate 12 Java libraries designed to access the information in the original Princeton Wordnet databases. From this comparison emerges a set of decision criteria that will enable a user to pick the library most suited to their purposes. I identify five deciding features: (1) availability of similarity metrics; (2) support for editing; (3) availability via Maven; (4) compatibility with retired Java versions; and (5) support for Enterprise Java. I also provide a comparison of other features of each library, the information exposed by each API, and the versions of Wordnet each library supports, and I evaluate each library for the speed of various retrieval operations. In the case that the user's application does not require one of the deciding features, I show that my library, JWI, the MIT Java Wordnet Interface, is the highest-performance, widest-coverage, easiest-to-use library available.
2013年11月01日 00:00:00 GMThttps://hdl.handle.net/1721.1/819492013年11月01日T00:00:00ZAsynchronous Failure Detectors
https://hdl.handle.net/1721.1/81371
Asynchronous Failure Detectors
Cornejo, Alejandro; Lynch, Nancy; Sastry, Srikanth
Failure detectors -- oracles that provide information about process crashes -- are an important abstraction for crash tolerance in distributed systems. The generality of failure-detector theory, while providing great expressiveness, poses significant challenges in developing a robust hierarchy of failure detectors. We address some of these challenges by proposing (1) a variant of failure detectors called asynchronous failure detectors and (2) an associated modeling framework. Unlike the traditional failure-detector framework, our framework eschews real-time completely. We show that asynchronous failure detectors are sufficiently expressive to include several popular failure detectors including, but not limited to, the canonical Chandra-Toueg failure detectors, Sigma and other quorum failure detectors, Omega, anti-Omega, Omega^k, and Psi_k. Additionally, asynchronous failure detectors satisfy many desirable properties: they are self-implementable, guarantee that stronger asynchronous failure-detectors solve harder problems, and ensure that their outputs encode no information other than the set of crashed processes. We introduce the notion of a failure detector being representative for a problem to capture the idea that some problems encode the same information about process crashes as their weakest failure detectors do. We show that a large class of problems, called bounded problems, do not have representative failure detectors. Finally, we use the asynchronous failure-detector framework to show how sufficiently strong AFDs circumvent the impossibility of consensus in asynchronous systems.
This report supersedes MIT-CSAIL-TR-2013-002.
2013年10月10日 00:00:00 GMThttps://hdl.handle.net/1721.1/813712013年10月10日T00:00:00ZDistributed Shared State with History Maintenance
https://hdl.handle.net/1721.1/81365
Distributed Shared State with History Maintenance
Panchekha, Pavel; Brodsky, Micah Z. (Micah Zev)
Shared mutable state is challenging to maintain in a distributed environment. We develop a technique, based on the Operational Transform, that guides independent agents into producing consistent states through inconsistent but equivalent histories of operations. Our technique, history maintenance, extends and streamlines the Operational Transform for general distributed systems. We describe how to use history maintenance to create eventually-consistent, strongly-consistent, and hybrid systems whose correctness is easy to reason about.
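To fix intuition about the underlying mechanism, here is the classic Operational Transform rule for two concurrent inserts, a tiny fragment of what history maintenance streamlines and generalizes; the names and tie-break rule are illustrative.

    # Sketch: transform two concurrent inserts so both application orders
    # converge to the same state.
    def transform_ins(a, b):
        # a, b: (position, text) inserts made concurrently on the same state.
        (pa, ta), (pb, tb) = a, b
        if pa < pb or (pa == pb and ta < tb):           # deterministic tie-break
            return a, (pb + len(ta), tb)
        return (pa + len(tb), ta), b

    def apply_ins(s, op):
        pos, txt = op
        return s[:pos] + txt + s[pos:]

    a, b = (1, "X"), (3, "Y")
    a2, b2 = transform_ins(a, b)
    print(apply_ins(apply_ins("abcd", a), b2))   # aXbcYd
    print(apply_ins(apply_ins("abcd", b), a2))   # aXbcYd -- same result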
2013年10月08日 00:00:00 GMThttps://hdl.handle.net/1721.1/813652013年10月08日T00:00:00ZMouse Behavior Recognition with The Wisdom of Crowd
https://hdl.handle.net/1721.1/80815
Mouse Behavior Recognition with The Wisdom of Crowd
Ni, Yuzhao; Frogner, Charles A.; Poggio, Tomaso A
In this thesis, we designed and implemented a crowdsourcing system to annotate mouse behaviors in videos; this involves the development of a novel clip-based video labeling tool that is more efficient than traditional labeling tools on crowdsourcing platforms, as well as the design of probabilistic inference algorithms that predict the true labels and the workers' expertise from multiple workers' responses. Our algorithms are shown to perform better than the majority-vote heuristic. We also carried out extensive experiments to determine the effectiveness of our labeling tool, inference algorithms, and the overall system.
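A much-simplified sketch of such joint inference, using a one-coin EM model in which each worker has a single accuracy parameter; the thesis's algorithms and label space are richer than this binary toy.

    # Sketch: jointly estimate true labels and worker accuracy by simple EM.
    from math import log

    def infer(responses, iters=10):
        # responses[clip][worker] = 0/1 label from that worker.
        workers = {w for r in responses.values() for w in r}
        acc = {w: 0.7 for w in workers}                 # initial accuracy guess
        for _ in range(iters):
            # E-step: vote weighted by each worker's log-odds of being right.
            truth = {c: int(sum((1 if v else -1) * log(acc[w] / (1 - acc[w]))
                                for w, v in r.items()) > 0)
                     for c, r in responses.items()}
            # M-step: re-estimate worker accuracy against the inferred truth.
            for w in workers:
                votes = [(truth[c], r[w]) for c, r in responses.items() if w in r]
                acc[w] = max(0.01, min(0.99,
                             sum(t == v for t, v in votes) / len(votes)))
        return truth, acc

    resp = {"clip1": {"w1": 1, "w2": 1, "w3": 0},
            "clip2": {"w1": 0, "w2": 0, "w3": 1},
            "clip3": {"w1": 1, "w2": 1, "w3": 1}}
    print(infer(resp))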
2013年9月19日 00:00:00 GMThttps://hdl.handle.net/1721.1/808152013年09月19日T00:00:00ZHarvesting Application Information for Industry-Scale Relational Schema Matching
https://hdl.handle.net/1721.1/80380
Harvesting Application Information for Industry-Scale Relational Schema Matching
Kushman, Nate; Adib, Fadel; Katabi, Dina; Barzilay, Regina
Consider the problem of migrating a company's CRM or ERP database from one application to another, or integrating two such databases as a result of a merger. This problem requires matching two large relational schemas with hundreds and sometimes thousands of fields. Further, the correct match is likely complex: rather than a simple one-to-one alignment, some fields in the source database may map to multiple fields in the target database, and others may have no equivalent fields in the target database. Despite major advances in schema matching, fully automated solutions to large relational schema matching problems are still elusive. This paper focuses on improving the accuracy of automated large relational schema matching. Our key insight is the observation that modern database applications have a rich user interface that typically exhibits more consistency across applications than the underlying schemas. We associate UI widgets in the application with the underlying database fields on which they operate and demonstrate that this association delivers new information useful for matching large and complex relational schemas. Additionally, we show how to formalize the schema matching problem as a quadratic program, and solve it efficiently using standard optimization and machine learning techniques. We evaluate our approach on real-world CRM applications with hundreds of fields and show that it improves the accuracy by a factor of 2-4x.
2013年9月10日 00:00:00 GMThttps://hdl.handle.net/1721.1/803802013年09月10日T00:00:00ZOptimal Bidirectional Rapidly-Exploring Random Trees
https://hdl.handle.net/1721.1/79884
Optimal Bidirectional Rapidly-Exploring Random Trees
Jordan, Matthew; Perez, Alejandro
In this paper we present a simple, computationally-efficient, two-tree variant of the RRT* algorithm along with several heuristics.
2013年8月15日 00:00:00 GMThttps://hdl.handle.net/1721.1/798842013年08月15日T00:00:00ZDoes invariant recognition predict tuning of neurons in sensory cortex?
https://hdl.handle.net/1721.1/79828
Does invariant recognition predict tuning of neurons in sensory cortex?
Poggio, Tomaso; Mutch, Jim; Anselmi, Fabio; Tacchetti, Andrea; Rosasco, Lorenzo; Leibo, Joel Z.
Tuning properties of simple cells in cortical V1 can be described in terms of a "universal shape" characterized by parameter values which hold across different species. This puzzling set of findings begs for a general explanation grounded on an evolutionarily important computational function of the visual cortex. We ask here whether these properties are predicted by the hypothesis that the goal of the ventral stream is to compute for each image a "signature" vector which is invariant to geometric transformations, with the additional assumption that the mechanism for continuously learning and maintaining invariance consists of the memory storage of a sequence of neural images of a few objects undergoing transformations (such as translation, scale changes and rotation) via Hebbian synapses. For V1 simple cells the simplest version of this hypothesis is the online Oja rule which implies that the tuning of neurons converges to the eigenvectors of the covariance of their input. Starting with a set of dendritic fields spanning a range of sizes, simulations supported by a direct mathematical analysis show that the solution of the associated "cortical equation" provides a set of Gabor-like wavelets with parameter values that are in broad agreement with the physiology data. We show however that the simple version of the Hebbian assumption does not predict all the physiological properties. The same theoretical framework also provides predictions about the tuning of cells in V4 and in the face patch AL which are in qualitative agreement with physiology data.
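Since the abstract names the online Oja rule explicitly, a small simulation can show the convergence it relies on: the weight vector tends to an eigenvector of the input covariance. The covariance, learning rate, and iteration count below are arbitrary choices.

    # Sketch: the online Oja rule converges to the principal eigenvector
    # of the input covariance (up to sign).
    import numpy as np

    rng = np.random.default_rng(0)
    C = np.array([[3.0, 1.0], [1.0, 2.0]])          # assumed input covariance
    L = np.linalg.cholesky(C)
    w = rng.normal(size=2)
    for _ in range(20000):
        x = L @ rng.normal(size=2)                   # input with covariance C
        y = w @ x
        w += 0.001 * (y * x - y * y * w)             # Oja: Hebb + normalization
    print(w / np.linalg.norm(w))                     # ~ principal eigenvector of C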
2013年8月06日 00:00:00 GMThttps://hdl.handle.net/1721.1/798282013年08月06日T00:00:00ZSound Input Filter Generation for Integer Overflow Errors
https://hdl.handle.net/1721.1/79827
Sound Input Filter Generation for Integer Overflow Errors
Long, Fan; Sidiroglou-Douskos, Stelios; Kim, Deokhwan; Rinard, Martin
We present a system, SIFT, for generating input filters that nullify integer overflow errors associated with critical program sites such as memory allocation or block copy sites. SIFT uses a static program analysis to generate filters that discard inputs that may trigger integer overflow errors in the computations of the sizes of allocated memory blocks or the number of copied bytes in block copy operations. The generated filters are sound: if an input passes the filter, it will not trigger an integer overflow error for any analyzed site. Our results show that SIFT successfully analyzes (and therefore generates sound input filters for) 52 out of 58 memory allocation and block memory copy sites in analyzed input processing modules from five applications (VLC, Dillo, Swfdec, Swftools, and GIMP). These nullified errors include six known integer overflow vulnerabilities. Our results also show that applying these filters to 62895 real-world inputs produces no false positives. The analysis and filter generation times are all less than a second.
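For concreteness, the kind of check a generated filter performs might look like the following, for a hypothetical allocation site whose size expression is width * height; the site, field names, and bound are invented for illustration.

    # Sketch: reject inputs whose size expression would overflow a 32-bit
    # unsigned range at a (hypothetical) memory-allocation site.
    U32_MAX = 2**32 - 1

    def passes_filter(width, height):
        return width * height <= U32_MAX     # checked with unbounded Python ints

    print(passes_filter(70000, 70000))       # False: would overflow at the site
    print(passes_filter(1024, 768))          # True: input is admitted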
2013年8月06日 00:00:00 GMThttps://hdl.handle.net/1721.1/798272013年08月06日T00:00:00ZConceptual Design of Software: A Research Agenda
https://hdl.handle.net/1721.1/79826
Conceptual Design of Software: A Research Agenda
Jackson, Daniel
A research agenda in software design is outlined, focusing on the role of concepts. The notions of concepts as "abstract affordances" and of conceptual integrity are discussed, and a series of small examples of conceptual models is given.
2013年8月08日 00:00:00 GMThttps://hdl.handle.net/1721.1/798262013年08月08日T00:00:00ZJigsaw: Scalable Software-Defined Caches (Extended Version)
https://hdl.handle.net/1721.1/79746
Jigsaw: Scalable Software-Defined Caches (Extended Version)
Beckmann, Nathan; Sanchez, Daniel
Shared last-level caches, widely used in chip-multiprocessors (CMPs), face two fundamental limitations. First, the latency and energy of shared caches degrade as the system scales up. Second, when multiple workloads share the CMP, they suffer from interference in shared cache accesses. Unfortunately, prior research addressing one issue either ignores or worsens the other: NUCA techniques reduce access latency but are prone to hotspots and interference, and cache partitioning techniques only provide isolation but do not reduce access latency. We present Jigsaw, a technique that jointly addresses the scalability and interference problems of shared caches. Hardware lets software define shares, collections of cache bank partitions that act as virtual caches, and map data to shares. Shares give software full control over both data placement and capacity allocation. Jigsaw implements efficient hardware support for share management, monitoring, and adaptation. We propose novel resource-management algorithms and use them to develop a system-level runtime that leverages Jigsaw to both maximize cache utilization and place data close to where it is used. We evaluate Jigsaw using extensive simulations of 16- and 64-core tiled CMPs. Jigsaw improves performance by up to 2.2x (18% avg) over a conventional shared cache, and significantly outperforms state-of-the-art NUCA and partitioning techniques.
2013年9月01日 00:00:00 GMThttps://hdl.handle.net/1721.1/797462013年09月01日T00:00:00ZCoded Emulation of Shared Atomic Memory for Message Passing Architectures
https://hdl.handle.net/1721.1/79606
Coded Emulation of Shared Atomic Memory for Message Passing Architectures
Cadambe, Viveck R.; Lynch, Nancy; Medard, Muriel; Musial, Peter
This paper considers the communication and storage costs of emulating atomic (linearizable) read/write shared memory in distributed message-passing systems. We analyze the costs of previously-proposed algorithms by Attiya, Bar-Noy, and Dolev (the ABD algorithm) and by Fan and Lynch (the LDR algorithm), and develop new coding-based algorithms that significantly reduce these costs. The paper contains three main contributions: (1) We present a new shared-memory algorithm that we call CAS, for Coded Atomic Storage. This algorithm uses erasure coding methods. (2) In a storage system with N servers that is resilient to f server failures, we show that the communication costs for the ABD and LDR algorithms, measured in terms of number of object values, are both at least f + 1, whereas the communication cost for CAS is N/(N-2f). (3) We also explicitly quantify the storage costs of the ABD, LDR, and CAS algorithms. The storage cost of the ABD algorithm, measured in terms of number of object values, is N; whereas the storage costs of the LDR and CAS algorithms are both unbounded. We present a modification of the CAS algorithm based on the idea of garbage collection. The modified version of CAS has a storage cost of (d + 1) N/(N-2f), where d is an upper bound on the number of operations that are concurrent with a read operation. Thus, if d is sufficiently small, the storage cost of CAS is lower than those of both the ABD and LDR algorithms.
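Plugging sample values into the costs quoted above (a worked example, not a figure from the paper): with N = 5 servers tolerating f = 1 failure, ABD and LDR communicate at least f + 1 = 2 object values per operation, CAS communicates N/(N-2f) = 5/3, and with d = 2 concurrent operations garbage-collected CAS stores (d+1)N/(N-2f) = 5 object values.

    def abd_ldr_comm(f):             # lower bound on communication, in object values
        return f + 1

    def cas_comm(N, f):              # CAS communication cost
        return N / (N - 2 * f)

    def cas_storage(N, f, d):        # garbage-collected CAS storage cost
        return (d + 1) * N / (N - 2 * f)

    N, f, d = 5, 1, 2
    print(abd_ldr_comm(f), cas_comm(N, f), cas_storage(N, f, d))   # 2 1.666... 5.0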
2013年7月17日 00:00:00 GMThttps://hdl.handle.net/1721.1/796062013年07月17日T00:00:00ZDynamic Input/Output Automata: a Formal and Compositional Model for Dynamic Systems
https://hdl.handle.net/1721.1/79420
Dynamic Input/Output Automata: a Formal and Compositional Model for Dynamic Systems
Attie, Paul C.; Lynch, Nancy A.
We present dynamic I/O automata (DIOA), a compositional model of dynamic systems, based on I/O automata. In our model, automata can be created and destroyed dynamically, as computation proceeds. In addition, an automaton can dynamically change its signature, that is, the set of actions in which it can participate. This allows us to model mobility, by enforcing the constraint that only automata at the same location may synchronize on common actions. Our model features operators for parallel composition, action hiding, and action renaming. It also features a notion of automaton creation, and a notion of trace inclusion from one dynamic system to another, which can be used to prove that one system implements the other. Our model is hierarchical: a dynamically changing system of interacting automata is itself modeled as a single automaton that is "one level higher." This can be repeated, so that an automaton that represents such a dynamic system can itself be created and destroyed. We can thus model the addition and removal of entire subsystems with a single action. We establish fundamental compositionality results for DIOA: if one component is replaced by another whose traces are a subset of the former, then the set of traces of the system as a whole can only be reduced, and not increased, i.e., no new behaviors are added. That is, parallel composition, action hiding, and action renaming are all monotonic with respect to trace inclusion. We also show that, under certain technical conditions, automaton creation is monotonic with respect to trace inclusion: if a system creates automaton Ai instead of (previously) creating automaton A'i, and the traces of Ai are a subset of the traces of A'i, then the set of traces of the overall system is possibly reduced, but not increased. Our trace inclusion results imply that trace equivalence is a congruence relation with respect to parallel composition, action hiding, and action renaming. Our trace inclusion results enable a design and refinement methodology based solely on the notion of externally visible behavior, and which is therefore independent of specific methods of establishing trace inclusion. It permits the refinement of components and subsystems in isolation from the entire system, and provides more flexibility in refinement than a methodology which is, for example, based on the monotonicity of forward simulation with respect to parallel composition. In the latter, every automaton must be refined using forward simulation, whereas in our framework different automata can be refined using different methods. The DIOA model was defined to support the analysis of mobile agent systems, in a joint project with researchers at Nippon Telegraph and Telephone. It can also be used for other forms of dynamic systems, such as systems described by means of object-oriented programs, and systems containing services with changing access permissions.
2013年7月08日 00:00:00 GMThttps://hdl.handle.net/1721.1/794202013年07月08日T00:00:00ZVerifying Quantitative Reliability of Programs That Execute on Unreliable Hardware
https://hdl.handle.net/1721.1/79355
Verifying Quantitative Reliability of Programs That Execute on Unreliable Hardware
Carbin, Michael; Misailovic, Sasa; Rinard, Martin
Emerging high-performance architectures are anticipated to contain unreliable components that may exhibit soft errors, which silently corrupt the results of computations. Full detection and recovery from soft errors is challenging, expensive, and, for some applications, unnecessary. For example, approximate computing applications (such as multimedia processing, machine learning, and big data analytics) can often naturally tolerate soft errors. In this paper we present Rely, a programming language that enables developers to reason about the quantitative reliability of an application -- namely, the probability that it produces the correct result when executed on unreliable hardware. Rely allows developers to specify the reliability requirements for each value that a function produces. We present a static quantitative reliability analysis that verifies quantitative requirements on the reliability of an application, enabling a developer to perform sound and verified reliability engineering. The analysis takes a Rely program with a reliability specification and a hardware specification, that characterizes the reliability of the underlying hardware components, and verifies that the program satisfies its reliability specification when executed on the underlying unreliable hardware platform. We demonstrate the application of quantitative reliability analysis on six computations implemented in Rely.
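A toy version of the quantitative reasoning (hypothetical numbers, not Rely's syntax or analysis): if each operation executed on unreliable hardware is correct independently with probability r, a straight-line computation with n such operations is correct with probability at least r^n, which can then be compared against the developer's specification.

    def reliability(n_unreliable_ops: int, r_per_op: float) -> float:
        """Lower bound: every unreliable op must produce a correct result."""
        return r_per_op ** n_unreliable_ops

    r = reliability(n_unreliable_ops=500, r_per_op=0.99999)  # hardware spec
    spec = 0.99                # required reliability of the function's result
    print(round(r, 5), r >= spec)   # ~0.99501 True: specification satisfied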
2013年6月19日 00:00:00 GMThttps://hdl.handle.net/1721.1/793552013年06月19日T00:00:00ZBody-form and body-pose recognition with a hierarchical model of the ventral stream
https://hdl.handle.net/1721.1/79354
Body-form and body-pose recognition with a hierarchical model of the ventral stream
Kim, Heejung; Wohlwend, Jeremy; Leibo, Joel Z.; Poggio, Tomaso
When learning to recognize a novel body shape, e.g., a panda bear, we are not misled by changes in its pose. A "jumping panda bear" is readily recognized, despite having no prior visual experience with the conjunction of these concepts. Likewise, a novel pose can be estimated in an invariant way, with respect to the actor's body shape. These body and pose recognition tasks require invariance to non-generic transformations that previous models of the ventral stream do not have. We show that the addition of biologically plausible, class-specific mechanisms associating previously-viewed actors in a range of poses enables a hierarchical model of object recognition to account for this human capability. These associations could be acquired in an unsupervised manner from past experience.
2013年6月18日 00:00:00 GMThttps://hdl.handle.net/1721.1/793542013年06月18日T00:00:00ZReactive Integrated Motion Planning and Execution Using Chekhov
https://hdl.handle.net/1721.1/79078
Reactive Integrated Motion Planning and Execution Using Chekhov
Shroff, Ameya
We envision a world in which robots and humans can collaborate to perform complex tasks in real-world environments. Current motion planners successfully generate trajectories for a robot with multiple degrees of freedom, in a cluttered environment, and ensure that the robot can achieve its goal while avoiding all the obstacles in the environment. However, these planners are not practical in real-world scenarios that involve unstructured, dynamic environments, for three primary reasons. First, these motion planners assume that the environment the robot is functioning in is well known and static, both during plan generation and plan execution. Second, these planners do not support temporal constraints, which are crucial for planning in a rapidly-changing environment and for allowing task synchronisation between the robot and other agents, like a human or even another robot. Third, the current planners do not adequately represent the requirements of the task. They often over-constrain the task description and are hence unable to take advantage of task flexibility which may aid in optimising energy efficiency or robustness. In this thesis we present Chekhov, a reactive, integrated motion planning and execution executive that addresses these shortcomings using four key innovations. First, unlike traditional planners, the planning and execution components of Chekhov are very closely integrated. This close coupling blurs the traditional, sharp boundary between the two components and allows for optimal collaboration. Second, Chekhov represents temporal constraints, which allows it to perform operations that are temporally synchronised with external events. Third, Chekhov uses an incremental search algorithm which allows it to rapidly generate a new plan if a disturbance is encountered that threatens the execution of the existing plan. Finally, unlike standard planners which generate a single reference trajectory from the start pose to the goal pose, Chekhov generates a Qualitative Control Plan using Flow Tubes that represent families of feasible trajectories and associated control policies. These flow tubes provide Chekhov with a flexibility that is extremely valuable and serve as Chekhov's first line of defence.
MEng thesis
2013年6月06日 00:00:00 GMThttps://hdl.handle.net/1721.1/790782013年06月06日T00:00:00ZA Publish-Subscribe Implementation of Network Management
https://hdl.handle.net/1721.1/79060
A Publish-Subscribe Implementation of Network Management
Simosa, Jorge D.
As modern networks become highly integrated and heterogeneous and experience exponential growth, the task of network management becomes increasingly unmanageable for network administrators and designers. The Knowledge Plane (KP) is designed to support a self-managing network, given the organizational constraints of network management, as well as to create synergy and exploit commonality among network applications. In this thesis, to build an Information Plane that is suitable to the requirements of the KP, we propose a publish/subscribe system that provides a clear and systematic framework for resolving tussles in the network. To evaluate the effectiveness of this design, we configured a network of PlanetLab nodes and conducted experiments involving a variety of file sizes and source-destination pairs. The results suggest that the system's performance is not only comparable to existing file transfer services, but that the system also introduces several performance gains that are unattainable with current network architectures.
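A minimal topic-based publish/subscribe core of the kind the system builds on (an illustration of the pattern only, not the PlanetLab implementation):

    from collections import defaultdict
    from typing import Callable

    class Broker:
        def __init__(self):
            self.subs = defaultdict(list)     # topic -> subscriber callbacks

        def subscribe(self, topic: str, handler: Callable[[str, bytes], None]):
            self.subs[topic].append(handler)

        def publish(self, topic: str, payload: bytes):
            for handler in self.subs[topic]:  # deliver to every subscriber
                handler(topic, payload)

    broker = Broker()
    broker.subscribe("net/load", lambda t, p: print(t, p.decode()))
    broker.publish("net/load", b"link-3 at 80% utilization")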
MEng thesis
2013年6月04日 00:00:00 GMThttps://hdl.handle.net/1721.1/790602013年06月04日T00:00:00ZBigBand: GHz-Wide Sensing and Decoding on Commodity Radios
https://hdl.handle.net/1721.1/79058
BigBand: GHz-Wide Sensing and Decoding on Commodity Radios
Hassanieh, Haitham; Shi, Lixin; Abari, Omid; Hamed, Ezzeldine; Katabi, Dina
The goal of this paper is to make sensing and decoding GHz of spectrum simple, cheap, and low power. Our thesis is simple: if we can build a technology that captures GHz of spectrum using commodity Wi-Fi radios, it will have the right cost and power budget to enable a variety of new applications such as GHz-wide dynamic access and concurrent decoding of diverse technologies. This vision will change today's situation where only expensive power-hungry spectrum analyzers can capture GHz-wide spectrum. Towards this goal, the paper harnesses the sparse Fourier transform to compute the frequency representation of a sparse signal without sampling it at full bandwidth. The paper makes the following contributions. First, it presents BigBand, a receiver that can sense and decode a sparse spectrum wider than its own digital bandwidth. Second, it builds a prototype of its design using 3 USRPs that each samples the spectrum at 50 MHz, producing a device that captures 0.9 GHz -- i.e., 6x larger bandwidth than the three USRPs combined. Finally, it extends its algorithm to enable spectrum sensing in scenarios where the spectrum is not sparse.
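The core bucketization trick can be demonstrated in a few lines (a sketch with hypothetical parameters; the real pipeline adds collision detection and recovery across co-prime subsampling rates): subsampling by L aliases frequency f into bucket f mod (N/L), and the phase ratio between two time-shifted subsamplings reveals f whenever a bucket contains a single tone.

    import numpy as np

    N, L = 1024, 8                   # signal length; subsampling factor
    M = N // L                       # number of buckets
    freqs, amps = [37, 300, 701], [1.0, 0.7, 0.5]   # a sparse spectrum
    n = np.arange(N)
    x = sum(a * np.exp(2j * np.pi * f * n / N) for f, a in zip(freqs, amps))

    B0 = np.fft.fft(x[0::L])         # buckets from the offset-0 subsampling
    B1 = np.fft.fft(x[1::L])         # same buckets, extra phase e^{2 pi i f / N}
    for k in np.argsort(-np.abs(B0))[:3]:          # the three occupied buckets
        phase = np.angle(B1[k] / B0[k])
        f_hat = int(round(phase / (2 * np.pi) * N)) % N
        print(k, f_hat)              # recovers 37, 300, 701 (each congruent to k mod M)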
2013年5月22日 00:00:00 GMThttps://hdl.handle.net/1721.1/790582013年05月22日T00:00:00ZOrganon: A Symbolic Constraint Framework & Solver
https://hdl.handle.net/1721.1/79057
Organon: A Symbolic Constraint Framework & Solver
Evans, Isaac; Lynch, Joseph
Organon is an open source system for expressing and solving complex symbolic constraints between generic entities. Our design avoids restricting the programmer's ability to phrase constraints; Organon acts purely as a framework that defines and holds together the key concepts of forms, constraints, and solvers. It has three main components: (1) Forms: Abstract representations of the entities to be constrained. (2) Constraints: Functions that symbolically express requirements on the relationships between forms as well as provide information a solver can use to improve the constraint's satisfaction. (3) Solvers: Functions which inspect instantiations of forms and manipulate them in an attempt to satisfy a set of objective constraints.
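A skeletal rendering of the three components (a minimal sketch of the concepts, not Organon's actual API; the form, constraint, and solver below are invented for illustration):

    import random

    class Form:                       # abstract representation of an entity
        def __init__(self, **attrs):
            self.attrs = dict(attrs)

    def min_gap(a, b, gap):
        """Constraint: error is 0 when the two forms are at least `gap` apart."""
        return max(0.0, gap - abs(a.attrs["x"] - b.attrs["x"]))

    def hill_climb_solver(forms, constraints, steps=2000):
        """Solver: randomly perturb forms, keeping changes that reduce error."""
        total = lambda: sum(c() for c in constraints)
        for _ in range(steps):
            f = random.choice(forms)
            old, before = f.attrs["x"], total()
            f.attrs["x"] = old + random.uniform(-1.0, 1.0)
            if total() > before:      # revert perturbations that hurt
                f.attrs["x"] = old
        return total()

    a, b = Form(x=0.0), Form(x=0.5)
    print(hill_climb_solver([a, b], [lambda: min_gap(a, b, gap=3.0)]))  # ~0.0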
2013年5月24日 00:00:00 GMThttps://hdl.handle.net/1721.1/790572013年05月24日T00:00:00ZHigh Spatial Resolution BRDFs with Metallic powders Using Wave Optics Analysis
https://hdl.handle.net/1721.1/78590
High Spatial Resolution BRDFs with Metallic Powders Using Wave Optics Analysis
Levin, Anat; Glasner, Daniel; Xiong, Ying; Durand, Fredo; Freeman, William; Matusik, Wojciech; Zickler, Todd
This manuscript completes the analysis of our SIGGRAPH 2013 paper "Fabricating BRDFs at High Spatial Resolution Using Wave Optics" in which photolithography fabrication was used for manipulating reflectance effects. While photolithography allows for precise reflectance control, it is costly to fabricate. Here we explore an inexpensive alternative to micro-fabrication, in the form of metallic powders. Such powders are readily available at a variety of particle sizes and morphologies. Using an analysis similar to the micro-fabrication paper, we provide guidelines for the relation between the particles' shape and size and the reflectance functions they can produce.
2013年4月24日 00:00:00 GMThttps://hdl.handle.net/1721.1/785902013年04月24日T00:00:00ZCompositional Policy Priors
https://hdl.handle.net/1721.1/78573
Compositional Policy Priors
Wingate, David; Diuk, Carlos; O'Donnell, Timothy; Tenenbaum, Joshua; Gershman, Samuel
This paper describes a probabilistic framework for incorporating structured inductive biases into reinforcement learning. These inductive biases arise from policy priors, probability distributions over optimal policies. Borrowing recent ideas from computational linguistics and Bayesian nonparametrics, we define several families of policy priors that express compositional, abstract structure in a domain. Compositionality is expressed using probabilistic context-free grammars, enabling a compact representation of hierarchically organized sub-tasks. Useful sequences of sub-tasks can be cached and reused by extending the grammars nonparametrically using Fragment Grammars. We present Monte Carlo methods for performing inference, and show how structured policy priors lead to substantially faster learning in complex domains compared to methods without inductive biases.
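To make the compositional prior concrete, here is a tiny PCFG over sub-task sequences together with a sampler (the grammar is invented for illustration and is not from the paper):

    import random

    # Policy -> Policy Policy (p=0.30) | "open-door" (p=0.35) | "pick-up-key" (p=0.35)
    GRAMMAR = {"Policy": [(0.30, ["Policy", "Policy"]),
                          (0.35, ["open-door"]),
                          (0.35, ["pick-up-key"])]}

    def sample(symbol="Policy"):
        if symbol not in GRAMMAR:               # terminal: a primitive action
            return [symbol]
        r, acc = random.random(), 0.0
        for p, rhs in GRAMMAR[symbol]:          # pick a production by probability
            acc += p
            if r <= acc:
                return [tok for s in rhs for tok in sample(s)]
        # float-safety fallback: take the last production
        return [tok for s in GRAMMAR[symbol][-1][1] for tok in sample(s)]

    print(sample())   # e.g. ['pick-up-key', 'open-door', 'open-door']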
2013年4月12日 00:00:00 GMThttps://hdl.handle.net/1721.1/785732013年04月12日T00:00:00ZTask-Structured Probabilistic I/O Automata
https://hdl.handle.net/1721.1/78359
Task-Structured Probabilistic I/O Automata
Canetti, Ran; Cheung, Ling; Kaynar, Dilsun; Liskov, Moses; Lynch, Nancy; Pereira, Olivier; Segala, Roberto
Modeling frameworks such as Probabilistic I/O Automata (PIOA) and Markov Decision Processes permit both probabilistic and nondeterministic choices. In order to use these frameworks to express claims about probabilities of events, one needs mechanisms for resolving nondeterministic choices. For PIOAs, nondeterministic choices have traditionally been resolved by schedulers that have perfect information about the past execution. However, these schedulers are too powerful for certain settings, such as cryptographic protocol analysis, where information must sometimes be hidden. Here, we propose a new, less powerful nondeterminism-resolution mechanism for PIOAs, consisting of tasks and local schedulers. Tasks are equivalence classes of system actions that are scheduled by oblivious, global task sequences. Local schedulers resolve nondeterminism within system components, based on local information only. The resulting task-PIOA framework yields simple notions of external behavior and implementation, and supports simple compositionality results. We also define a new kind of simulation relation, and show it to be sound for proving implementation. We illustrate the potential of the task-PIOA framework by outlining its use in verifying an Oblivious Transfer protocol.
"May 28, 2009."
2009年1月01日 00:00:00 GMThttps://hdl.handle.net/1721.1/783592009年01月01日T00:00:00ZTracking 3-D Rotations with the Quaternion Bingham Filter
https://hdl.handle.net/1721.1/78248
Tracking 3-D Rotations with the Quaternion Bingham Filter
Glover, Jared; Kaelbling, Leslie Pack
A deterministic method for sequential estimation of 3-D rotations is presented. The Bingham distribution is used to represent uncertainty directly on the unit quaternion hypersphere. Quaternions avoid the degeneracies of other 3-D orientation representations, while the Bingham distribution allows tracking of large-error (high-entropy) rotational distributions. Experimental comparison to a leading EKF-based filtering approach on both synthetic signals and a ball-tracking dataset shows that the Quaternion Bingham Filter (QBF) has lower tracking error than the EKF, particularly when the state is highly dynamic. We present two versions of the QBF, suitable for tracking the state of first- and second-order rotating dynamical systems.
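A minimal numeric sketch of the two ingredients (unnormalized Bingham log-density on unit quaternions, and quaternion composition for the process model); the parameters below are made up for illustration:

    import numpy as np

    def quat_mult(q, r):
        """Hamilton product; quaternions are (w, x, y, z)."""
        w1, x1, y1, z1 = q
        w2, x2, y2, z2 = r
        return np.array([w1*w2 - x1*x2 - y1*y2 - z1*z2,
                         w1*x2 + x1*w2 + y1*z2 - z1*y2,
                         w1*y2 - x1*z2 + y1*w2 + z1*x2,
                         w1*z2 + x1*y2 - y1*x2 + z1*w2])

    def bingham_logpdf_unnorm(q, V, Z):
        """log p(q) = sum_i z_i (v_i . q)^2 + const, with Z <= 0."""
        return float(sum(z * (v @ q)**2 for z, v in zip(Z, V.T)))

    mode = np.array([1.0, 0.0, 0.0, 0.0])        # identity rotation
    V = np.eye(4)[:, 1:]                         # directions orthogonal to the mode
    Z = np.array([-50.0, -50.0, -50.0])          # concentrations (more negative = tighter)
    delta = np.array([np.cos(0.05), np.sin(0.05), 0.0, 0.0])  # small known rotation
    pred = quat_mult(delta, mode)                # propagated orientation estimate
    print(bingham_logpdf_unnorm(pred, V, Z))     # ~ -0.125: still near the mode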
2013年3月27日 00:00:00 GMThttps://hdl.handle.net/1721.1/782482013年03月27日T00:00:00ZFaces as a "Model Category" for Visual Object Recognition
https://hdl.handle.net/1721.1/77936
Faces as a "Model Category" for Visual Object Recognition
Tan, Cheston; Poggio, Tomaso
Visual recognition is an important ability that is central to many everyday tasks such as reading, navigation and social interaction, and is therefore actively studied in neuroscience, cognitive psychology and artificial intelligence. There exist thousands of object categories, all of which pose similar challenges to biological and artificial visual systems: accurate recognition under varying location, scale, view angle, illumination and clutter. In many areas of science, important discoveries have been made using "model organisms" such as fruit flies, mice and macaques. For the thousands of object categories, the important and well-studied category of faces could potentially serve as a "model category" upon which efforts are focused, and from which fundamental insights are drawn. However, it has been hotly debated whether faces are processed by the brain in a manner fundamentally different from other categories. Here we show that "neural tuning size" -- a single parameter in a computational model of object processing -- is able to account for important face-specific phenomena. Thus, surprisingly, "face-like" processing is explainable by physiological mechanisms that differ only quantitatively from "object-like" processing. Our computational proof-of-principle provides specific neural tuning properties that correspond to the so-far qualitative and controversial notion of "holistic" face processing. Overall, faces may be a viable model category. Since faces are highly amenable to complementary experimental techniques like functional MRI, electrophysiology, electroencephalography and transcranial magnetic stimulation, this further raises the odds that the algorithms and neural circuits underlying visual recognition may first be solved for faces. With faces serving as a model category, the great scientific challenge of understanding and reverse-engineering general visual recognition can be greatly accelerated.
2013年3月18日 00:00:00 GMThttps://hdl.handle.net/1721.1/779362013年03月18日T00:00:00ZA Plan for Optimizing Network-Intensive Cloud Applications
https://hdl.handle.net/1721.1/77238
A Plan for Optimizing Network-Intensive Cloud Applications
LaCurts, Katrina; Deng, Shuo; Balakrishnan, Hari
A significant and growing number of applications deployed on cloud infrastructures are network-intensive. These applications are frequently bottlenecked by the speed of network connections between the machines on which they are deployed. Due to the complexity and size of cloud networks, such applications often run slowly or have unpredictable completion times and/or throughput, both of which can result in increased cost to the customer. In this paper, we argue that cloud customers should be able to express the demands and objectives of their applications. We outline an architecture that allows for this type of expression, and distributes applications within the cloud network such that the application's objectives are met. We discuss some of the key questions that need to be addressed to implement the architecture, as well as the interactions between optimizations done by clients and by cloud providers. We also present preliminary results that indicate that these types of systems are feasible and improve performance.
2013年2月12日 00:00:00 GMThttps://hdl.handle.net/1721.1/772382013年02月12日T00:00:00ZAsynchronous Failure Detectors
https://hdl.handle.net/1721.1/76716
Asynchronous Failure Detectors
Cornejo, Alejandro; Lynch, Nancy; Sastry, Srikanth
Failure detectors -- oracles that provide information about process crashes -- are an important abstraction for crash tolerance in distributed systems. The generality of failure-detector theory, while providing great expressiveness, poses significant challenges in developing a robust hierarchy of failure detectors. We address some of these challenges by proposing (1) a variant of failure detectors called asynchronous failure detectors (AFDs) and (2) an associated modeling framework. Unlike the traditional failure-detector framework, our framework eschews real time completely. We show that asynchronous failure detectors are sufficiently expressive to include several popular failure detectors including, but not limited to, the canonical Chandra-Toueg failure detectors, Sigma and other quorum failure detectors, Omega, anti-Omega, Omega^k, and Psi_k. Additionally, asynchronous failure detectors satisfy many desirable properties: they are self-implementable, guarantee that stronger asynchronous failure detectors solve harder problems, and ensure that their outputs encode no information other than the set of crashed processes. We introduce the notion of a failure detector being representative for a problem to capture the idea that some problems encode the same information about process crashes as their weakest failure detectors do. We show that a large class of problems, called bounded problems, do not have representative failure detectors. Finally, we use the asynchronous failure-detector framework to show how sufficiently strong AFDs circumvent the impossibility of consensus in asynchronous systems.
This report is superseded by MIT-CSAIL-TR-2013-025.
2013年1月30日 00:00:00 GMThttps://hdl.handle.net/1721.1/767162013年01月30日T00:00:00ZSecuring Deployed RFIDs by Randomizing the Modulation and the Channel
https://hdl.handle.net/1721.1/76260
Securing Deployed RFIDs by Randomizing the Modulation and the Channel
Wang, Jue; Hassanieh, Haitham; Katabi, Dina; Kohno, Tadayoshi
RFID cards are widely used today in sensitive applications such as access control, payment systems, and asset tracking. Past work shows that an eavesdropper snooping on the communication between a card and its legitimate reader can break their cryptographic protocol and obtain their secret keys. One solution for this problem is to install stronger cryptographic protocols on the cards. However, RFIDs' size, power, and cost limitations do not allow for conventional cryptographic protocols. Further, installing new protocols requires revoking billions of cards in consumers' hands and facilities worldwide, which is costly and impractical. In this paper, we ask whether one can secure RFIDs from such attacks without revoking or changing the insecure cards. We propose LocRF, a solution that changes the signal used to read the RFID cards but does not require any changes to the cards themselves. LocRF introduces a new approach that randomizes the modulation of the RFID signal as well as the wireless channel. This design protects RFIDs from eavesdroppers even if they use multi-antenna MIMO receivers. We built a prototype of LocRF on software-defined radios and used it to secure the communication of off-the-shelf cards. Both our analysis and empirical evaluation demonstrate the effectiveness of LocRF.
2013年1月12日 00:00:00 GMThttps://hdl.handle.net/1721.1/762602013年01月12日T00:00:00ZThe computational magic of the ventral stream: sketch of a theory (and why some deep architectures work).
https://hdl.handle.net/1721.1/76248
The computational magic of the ventral stream: sketch of a theory (and why some deep architectures work).
Poggio, Tomaso; Mutch, Jim; Leibo, Joel; Rosasco, Lorenzo; Tacchetti, Andrea
This paper explores the theoretical consequences of a simple assumption: the computational goal of the feedforward path in the ventral stream -- from V1, V2, and V4 to IT -- is to discount image transformations, after learning them during development.
2012年12月29日 00:00:00 GMThttps://hdl.handle.net/1721.1/762482012年12月29日T00:00:00Z5D Covariance Tracing for Efficient Defocus and Motion Blur
https://hdl.handle.net/1721.1/74662
5D Covariance Tracing for Efficient Defocus and Motion Blur
Belcour, Laurent; Soler, Cyril; Subr, Kartic; Holzschuch, Nicolas; Durand, Fredo
The rendering of effects such as motion blur and depth-of-field requires costly 5D integrals. We dramatically accelerate their computation through adaptive sampling and reconstruction based on the prediction of the anisotropy and bandwidth of the integrand. For this, we develop a new frequency analysis of the 5D temporal light-field, and show that first-order motion can be handled through simple changes of coordinates in 5D. We further introduce a compact representation of the spectrum using the covariance matrix and Gaussian approximations. We derive update equations for the 5 × 5 covariance matrices for each atomic light transport event, such as transport, occlusion, BRDF, texture, lens, and motion. The focus on atomic operations makes our work general, and removes the need for special-case formulas. We present a new rendering algorithm that computes 5D covariance matrices on the image plane by tracing paths through the scene, focusing on the single-bounce case. This allows us to reduce sampling rates when appropriate and perform reconstruction of images with complex depth-of-field and motion blur effects.
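For a concrete instance of one atomic update (with made-up numbers): under a linear reparameterization T of the 5D light field, such as the paraxial free-space shear x <- x + d*theta, y <- y + d*phi for travel distance d, a covariance matrix updates by congruence, Sigma' = T Sigma T^T; the paper derives the appropriate matrix for each event type.

    import numpy as np

    def transport_matrix(d):
        """Paraxial free-space travel of distance d in (x, y, theta, phi, t)."""
        T = np.eye(5)
        T[0, 2] = d      # x <- x + d * theta
        T[1, 3] = d      # y <- y + d * phi
        return T

    Sigma = np.diag([1.0, 1.0, 0.1, 0.1, 0.01])    # example 5D covariance
    T = transport_matrix(d=2.0)
    Sigma_after = T @ Sigma @ T.T                  # congruence update per event
    print(Sigma_after[0, 0], Sigma_after[0, 2])    # 1.4 0.2: space-angle coupling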
2012年11月16日 00:00:00 GMThttps://hdl.handle.net/1721.1/746622012年11月16日T00:00:00ZMonitoring the Execution of Temporal Plans for Robotic Systems
https://hdl.handle.net/1721.1/73686
Monitoring the Execution of Temporal Plans for Robotic Systems
Levine, Steven J.
To achieve robustness in dynamic and uncertain environments, robotic systems must monitor the progress of their plans during execution. This thesis develops a plan executive called Pike that is capable of executing and monitoring plans. The execution monitor at its core quickly and efficiently detects relevant disturbances that threaten future actions in the plan. We present a set of novel offline algorithms that extract sets of candidate causal links from temporally-flexible plans. A second set of algorithms uses these causal links to monitor the execution online and detect problems with low latency. We additionally introduce the TBurton executive, a system capable of robustly meeting a user's high-level goals through the combined use of Pike and a temporal generative planner. An innovative voice-commanded robot is demonstrated in hardware and simulation that robustly meets high-level goals and verbalizes any causes of failure using the execution monitor.
MEng thesis
2012年10月04日 00:00:00 GMThttps://hdl.handle.net/1721.1/736862012年10月04日T00:00:00ZA Gaussian Approximation of Feature Space for Fast Image Similarity
https://hdl.handle.net/1721.1/73685
A Gaussian Approximation of Feature Space for Fast Image Similarity
Gharbi, Michael; Malisiewicz, Tomasz; Paris, Sylvain; Durand, Frédo
We introduce a fast technique for the robust computation of image similarity. It builds on a re-interpretation of the recent exemplar-based SVM approach, where a linear SVM is trained at a query point and distance is computed as the dot product with the normal to the separating hyperplane. Although exemplar-based SVM is slow because it requires new training for each exemplar, the approach has shown robustness for image retrieval and object classification, yielding state-of-the-art performance on the PASCAL VOC 2007 detection task despite its simplicity. We re-interpret it by viewing the SVM between a single point and the set of negative examples as the computation of the tangent to the manifold of images at the query. We show that, in a high-dimensional space such as that of image features, all points tend to lie at the periphery and that they are usually separable from the rest of the set. We then use a simple Gaussian approximation to the set of all images in feature space, and fit it by computing the covariance matrix on a large training set. Given the covariance matrix, the computation of the tangent or normal at a point is straightforward and is a simple multiplication by the inverse covariance. This allows us to dramatically speed up image retrieval tasks, going from more than ten minutes to a single second. We further show that our approach is equivalent to feature-space whitening and has links to image saliency.
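The resulting similarity is just a whitened dot product; a sketch with random vectors standing in for real image features:

    import numpy as np

    rng = np.random.default_rng(1)
    train = rng.normal(size=(5000, 64))            # stand-in feature vectors
    mu = train.mean(axis=0)
    Sigma_inv = np.linalg.inv(np.cov(train.T) + 1e-6 * np.eye(64))

    def similarity(query, candidates):
        """w = Sigma^{-1}(q - mu) is the normal at the query; rank candidates
        by their projection onto w (equivalent to whitening the feature space)."""
        w = Sigma_inv @ (query - mu)
        return candidates @ w

    q = rng.normal(size=64)
    cands = np.vstack([q + 0.1 * rng.normal(size=64),   # near-duplicate of q
                       rng.normal(size=64)])            # unrelated vector
    print(similarity(q, cands))     # the near-duplicate scores much higher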
2012年10月01日 00:00:00 GMThttps://hdl.handle.net/1721.1/736852012年10月01日T00:00:00ZRobust Tracking for Real-Time Dense RGB-D Mapping with Kintinuous
https://hdl.handle.net/1721.1/73167
Robust Tracking for Real-Time Dense RGB-D Mapping with Kintinuous
Whelan, Thomas; Johannsson, Hordur; Kaess, Michael; Leonard, John J.; McDonald, John
This paper describes extensions to the Kintinuous algorithm for spatially extended KinectFusion, incorporating the following additions: (i) the integration of multiple 6DOF camera odometry estimation methods for robust tracking; (ii) a novel GPU-based implementation of an existing dense RGB-D visual odometry algorithm; (iii) advanced fused real-time surface coloring. These extensions are validated with extensive experimental results, both quantitative and qualitative, demonstrating the ability to build dense fully colored models of spatially extended environments for robotics and virtual reality applications while remaining robust against scenes with challenging sets of geometric and visual features.
2012年9月17日 00:00:00 GMThttps://hdl.handle.net/1721.1/731672012年09月17日T00:00:00ZAeolus Reference Manual
https://hdl.handle.net/1721.1/73017
Aeolus Reference Manual
Liskov, Barbara
This document describes the interface that the Aeolus information flow platform provides for users who are implementing applications using Java. The document explains how the Aeolus features are made available by means of a Java library.
2012年9月14日 00:00:00 GMThttps://hdl.handle.net/1721.1/730172012年09月14日T00:00:00ZMultiscale Geometric Methods for Data Sets I: Multiscale SVD, Noise and Curvature
https://hdl.handle.net/1721.1/72597
Multiscale Geometric Methods for Data Sets I: Multiscale SVD, Noise and Curvature
Little, Anna V.; Maggioni, Mauro; Rosasco, Lorenzo
Large data sets are often modeled as being noisy samples from probability distributions in R^D, with D large. It has been noticed that oftentimes the support M of these probability distributions seems to be well-approximated by low-dimensional sets, perhaps even by manifolds. We shall consider sets that are locally well approximated by k-dimensional planes, with k << D, with k-dimensional manifolds isometrically embedded in R^D being a special case. Samples from this distribution are furthermore corrupted by D-dimensional noise. Certain tools from multiscale geometric measure theory and harmonic analysis seem well-suited to be adapted to the study of samples from such probability distributions, in order to yield quantitative geometric information about them. In this paper we introduce and study multiscale covariance matrices, i.e. covariances corresponding to the distribution restricted to a ball of radius r, with a fixed center and varying r, and under rather general geometric assumptions we study how their empirical, noisy counterparts behave. We prove that in the range of scales where these covariance matrices are most informative, the empirical, noisy covariances are close to their expected, noiseless counterparts. In fact, this is true as soon as the number of samples in the balls where the covariance matrices are computed is linear in the intrinsic dimension of M. As an application, we present an algorithm for estimating the intrinsic dimension of M.
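A compact version of the multiscale procedure on synthetic data (a circle, so intrinsic dimension k = 1, embedded in R^10 with noise; the radii and noise level are arbitrary): compute the covariance of the points inside a ball of radius r and watch how many singular values grow linearly with r.

    import numpy as np

    rng = np.random.default_rng(0)
    D, sigma = 10, 0.02
    t = rng.uniform(0, 2 * np.pi, size=4000)
    X = np.zeros((4000, D))
    X[:, 0], X[:, 1] = np.cos(t), np.sin(t)   # 1-dimensional manifold in R^D
    X += sigma * rng.normal(size=X.shape)     # ambient D-dimensional noise

    center = X[0]
    for r in (0.1, 0.2, 0.4):
        ball = X[np.linalg.norm(X - center, axis=1) < r]
        sv = np.linalg.svd(ball - ball.mean(0), compute_uv=False)
        print(r, np.round(sv[:3] / np.sqrt(len(ball)), 3))
        # one singular value grows roughly linearly in r (the tangent direction);
        # the others sit near the noise floor sigma until curvature intervenes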
2012年9月08日 00:00:00 GMThttps://hdl.handle.net/1721.1/725972012年09月08日T00:00:00ZA Social-Welfare Optimal Probabilistic Mechanism for Knightian Single-Good Auctions
https://hdl.handle.net/1721.1/72584
A Social-Welfare Optimal Probabilistic Mechanism for Knightian Single-Good Auctions
Chiesa, Alessandro; Micali, Silvio; Zhu, Zeyuan Allen
We provide an optimal probabilistic mechanism for maximizing social welfare in single-good auctions when each player does not know his true valuation for the good, but only a set of valuations that is guaranteed to include his true one.
2012年9月07日 00:00:00 GMThttps://hdl.handle.net/1721.1/725842012年09月07日T00:00:00ZFrom Formal Methods to Executable Code
https://hdl.handle.net/1721.1/72537
From Formal Methods to Executable Code
Musial, Peter M.
The objective of this work is the derivation of software that is verifiably correct. Our approach is to abstract system specifications and model these in a formal framework called Timed Input/Output Automata, which provides a notation for expressing distributed systems and mathematical support for reasoning about their properties. Although formal reasoning is easier at an abstract level, it is not clear how to transform these abstractions into executable code. If, during system implementation, an abstract system specification is left up to human interpretation, this opens the possibility of undesirable behaviors being introduced into the final code, thereby nullifying all formal efforts. This manuscript addresses this issue and presents a set of methods for transforming systems described as a network of timed automata into Java code for distributed platforms. We prove that the presented transformation methods preserve guarantees of the source specifications, and therefore result in code that is correct by construction.
Note: the cover page of this report shows an incorrect title. The title given on the first page of the document itself is correct.
2012年8月27日 00:00:00 GMThttps://hdl.handle.net/1721.1/725372012年08月27日T00:00:00ZBounded-Contention Coding for Wireless Networks in the High SNR Regime
https://hdl.handle.net/1721.1/72536
Bounded-Contention Coding for Wireless Networks in the High SNR Regime
Censor-Hillel, Keren; Haeupler, Bernhard; Lynch, Nancy; Medard, Muriel
Efficient communication in wireless networks is typically challenged by the possibility of interference among several transmitting nodes. Much important research has been invested in decreasing the number of collisions in order to obtain faster algorithms for communication in such networks. This paper proposes a novel approach for wireless communication, which embraces collisions rather than avoiding them, over an additive channel. It introduces a coding technique called Bounded-Contention Coding (BCC) that allows collisions to be successfully decoded by the receiving nodes into the original transmissions and whose complexity depends on a bound on the contention among the transmitters. BCC enables deterministic local broadcast in a network with n nodes and at most a transmitters with information of L bits each within O(a log n + aL) bits of communication with full-duplex radios, and O((a log n + aL)(log n)) bits, with high probability, with half-duplex radios. When combined with random linear network coding, BCC gives global broadcast within O((D + a + log n)(a log n + L)) bits, with high probability. This also holds in dynamic networks that can change arbitrarily over time by a worst-case adversary. When no bound on the contention is given, it is shown how to probabilistically estimate it and obtain global broadcast that is adaptive to the true contention in the network.
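The embrace-collisions idea in its simplest form (a toy additive-channel demo; actual BCC codewords are far shorter, O(a log n + aL) bits rather than n): if each of at most a transmitters sends the indicator vector of its identity, the receiver decodes the collided sum exactly.

    import numpy as np

    n, a = 16, 3                         # network size; contention bound
    transmitters = [2, 7, 11]            # at most a concurrent senders
    codeword = lambda i: np.eye(n, dtype=int)[i]

    channel = sum(codeword(i) for i in transmitters)   # additive collision
    print(np.flatnonzero(channel).tolist())   # [2, 7, 11]: the collision decodes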
2012年8月27日 00:00:00 GMThttps://hdl.handle.net/1721.1/725362012年08月27日T00:00:00ZUsing Program Synthesis for Social Recommendations
https://hdl.handle.net/1721.1/72106
Using Program Synthesis for Social Recommendations
Cheung, Alvin; Solar-Lezama, Armando; Madden, Samuel
This paper presents a new approach to select events of interest to a user in a social media setting where events are generated by the activities of the user's friends through their mobile devices. We argue that given the unique requirements of the social media setting, the problem is best viewed as an inductive learning problem, where the goal is to first generalize from the users' expressed "likes" and "dislikes" of specific events, then to produce a program that can be manipulated by the system and distributed to the collection devices to collect only data of interest. The key contribution of this paper is a new algorithm that combines existing machine learning techniques with new program synthesis technology to learn users' preferences. We show that when compared with the more standard approaches, our new algorithm provides up to order-of-magnitude reductions in model training time, and significantly higher prediction accuracies for our target application. The approach also improves on standard machine learning techniques in that it produces clear programs that can be manipulated to optimize data collection and filtering.
2012年8月13日 00:00:00 GMThttps://hdl.handle.net/1721.1/721062012年08月13日T00:00:00ZThe Order Independence of Iterated Dominance in Extensive Games, with Connections to Mechanism Design and Backward Induction
https://hdl.handle.net/1721.1/71953
The Order Independence of Iterated Dominance in Extensive Games, with Connections to Mechanism Design and Backward Induction
Chen, Jing; Micali, Silvio
Shimoji and Watson (1998) prove that a strategy of an extensive game is rationalizable in the sense of Pearce if and only if it survives the maximal elimination of conditionally dominated strategies. Briefly, this process iteratively eliminates conditionally dominated strategies according to a specific order, which is also the start of an order of elimination of weakly dominated strategies. Since the final set of possible payoff profiles, or terminal nodes, surviving iterated elimination of weakly dominated strategies may be order-dependent, one may suspect that the same holds for conditional dominance. We prove that, although the sets of strategy profiles surviving two arbitrary elimination orders of conditional dominance may be very different from each other, they are equivalent in the following sense: for each player i and each pair of elimination orders, there exists a function phi_i mapping each strategy of i surviving the first order to a strategy of i surviving the second order, such that, for every strategy profile s surviving the first order, the profile (phi_i(s_i))_i induces the same terminal node as s does. To prove our results we put forward a new notion of dominance and an elementary characterization of extensive-form rationalizability (EFR) that may be of independent interest. We also establish connections between EFR and other existing iterated dominance procedures, using our notion of dominance and our characterization of EFR.
2012年7月31日 00:00:00 GMThttps://hdl.handle.net/1721.1/719532012年07月31日T00:00:00ZPatch complexity, finite pixel correlations and optimal denoising
https://hdl.handle.net/1721.1/71919
Patch complexity, finite pixel correlations and optimal denoising
Levin, Anat; Nadler, Boaz; Durand, Fredo; Freeman, William T.
Image restoration tasks are ill-posed problems, typically solved with priors. Since the optimal prior is the exact unknown density of natural images, actual priors are only approximate and typically restricted to small patches. This raises several questions: How much may we hope to improve current restoration results with future sophisticated algorithms? And more fundamentally, even with perfect knowledge of natural image statistics, what is the inherent ambiguity of the problem? In addition, since most current methods are limited to finite support patches or kernels, what is the relation between the patch complexity of natural images, patch size, and restoration errors? Focusing on image denoising, we make several contributions. First, in light of computational constraints, we study the relation between denoising gain and sample size requirements in a nonparametric approach. We present a law of diminishing return, namely that with increasing patch size, rare patches not only require a much larger dataset, but also gain little from it. This result suggests novel adaptive variable-sized patch schemes for denoising. Second, we study absolute denoising limits, regardless of the algorithm used, and the convergence rate to them as a function of patch size. Scale invariance of natural images plays a key role here and implies both a strictly positive lower bound on denoising and a power law convergence. Extrapolating this parametric law gives a ballpark estimate of the best achievable denoising, suggesting that some improvement, although modest, is still possible.
2012年10月07日 00:00:00 GMThttps://hdl.handle.net/1721.1/719192012年10月07日T00:00:00ZViewstamped Replication Revisited
https://hdl.handle.net/1721.1/71763
Viewstamped Replication Revisited
Liskov, Barbara; Cowling, James
This paper presents an updated version of Viewstamped Replication, a replication technique that handles failures in which nodes crash. It describes how client requests are handled, how the group reorganizes when a replica fails, and how a failed replica is able to rejoin the group. The paper also describes a number of important optimizations and presents a protocol for handling reconfigurations that can change both the group membership and the number of failures the group is able to handle.
2012年7月23日 00:00:00 GMThttps://hdl.handle.net/1721.1/717632012年07月23日T00:00:00ZKintinuous: Spatially Extended KinectFusion
https://hdl.handle.net/1721.1/71756
Kintinuous: Spatially Extended KinectFusion
Whelan, Thomas; Kaess, Michael; Fallon, Maurice; Johannsson, Hordur; Leonard, John; McDonald, John
In this paper we present an extension to the KinectFusion algorithm that permits dense mesh-based mapping of extended scale environments in real-time. This is achieved through (i) altering the original algorithm such that the region of space being mapped by the KinectFusion algorithm can vary dynamically, (ii) extracting a dense point cloud from the regions that leave the KinectFusion volume due to this variation, and, (iii) incrementally adding the resulting points to a triangular mesh representation of the environment. The system is implemented as a set of hierarchical multi-threaded components which are capable of operating in real-time. The architecture facilitates the creation and integration of new modules with minimal impact on the performance on the dense volume tracking and surface reconstruction modules. We provide experimental results demonstrating the system's ability to map areas considerably beyond the scale of the original KinectFusion algorithm including a two story apartment and an extended sequence taken from a car at night. In order to overcome failure of the iterative closest point (ICP) based odometry in areas of low geometric features we have evaluated the Fast Odometry from Vision (FOVIS) system as an alternative. We provide a comparison between the two approaches where we show a trade off between the reduced drift of the visual odometry approach and the higher local mesh quality of the ICP-based approach. Finally we present ongoing work on incorporating full simultaneous localisation and mapping (SLAM) pose-graph optimisation.
2012年7月19日 00:00:00 GMThttps://hdl.handle.net/1721.1/717562012年07月19日T00:00:00ZIntegrated robot task and motion planning in belief space
https://hdl.handle.net/1721.1/71529
Integrated robot task and motion planning in belief space
Kaelbling, Leslie Pack; Lozano-Perez, Tomas
In this paper, we describe an integrated strategy for planning, perception, state-estimation and action in complex mobile manipulation domains. The strategy is based on planning in the belief space of probability distributions over states. Our planning approach is based on hierarchical goal regression (pre-image back-chaining). We develop a vocabulary of fluents that describe sets of belief states, which are goals and subgoals in the planning process. We show that a relatively small set of symbolic operators lead to task-oriented perception in support of the manipulation goals. An implementation of this method is demonstrated in simulation and on a real PR2 robot, showing robust, flexible solution of mobile manipulation problems with multiple objects and substantial uncertainty.
2012年7月03日 00:00:00 GMThttps://hdl.handle.net/1721.1/715292012年07月03日T00:00:00ZIntegrated Robot Task and Motion Planning in the Now
https://hdl.handle.net/1721.1/71521
Integrated Robot Task and Motion Planning in the Now
Kaelbling, Leslie Pack; Lozano-Perez, Tomas
This paper provides an approach to integrating geometric motion planning with logical task planning for long-horizon tasks in domains with many objects. We propose a tight integration between the logical and geometric aspects of planning. We use a logical representation which includes entities that refer to poses, grasps, paths and regions, without the need for a priori discretization. Given this representation and some simple mechanisms for geometric inference, we characterize the pre-conditions and effects of robot actions in terms of these logical entities. We then reason about the interaction of the geometric and non-geometric aspects of our domains using the general-purpose mechanism of goal regression (also known as pre-image backchaining). We propose an aggressive mechanism for temporal hierarchical decomposition, which postpones the pre-conditions of actions to create an abstraction hierarchy that both limits the lengths of plans that need to be generated and limits the set of objects relevant to each plan. We describe an implementation of this planning method and demonstrate it in a simulated kitchen environment in which it solves problems that require approximately 100 individual pick or place operations for moving multiple objects in a complex domain.
2012年6月29日 00:00:00 GMThttps://hdl.handle.net/1721.1/715212012年06月29日T00:00:00ZEpistemic Implementation and The Arbitrary-Belief Auction
https://hdl.handle.net/1721.1/71232
Epistemic Implementation and The Arbitrary-Belief Auction
Chen, Jing; Micali, Silvio; Pass, Rafael
In settings of incomplete information we put forward an epistemic framework for designing mechanisms that successfully leverage the players' arbitrary higher-order beliefs, even when such beliefs are totally wrong, and even when the players are rational in a very weak sense. Following Aumann (1995), we consider a player i rational if he uses a pure strategy s_i such that no alternative pure strategy s_i' performs better than s_i in every world i considers possible, and consider him order-k rational if he is rational and believes that all other players are order-(k-1) rational. We then introduce an iterative deletion procedure of dominated strategies and use it to precisely characterize the strategies consistent with the players being order-k rational. We exemplify the power of our framework in single-good auctions by introducing and achieving a new class of revenue benchmarks, defined over the players' arbitrary beliefs, that can be much higher than classical ones, and are unattainable by traditional mechanisms. Namely, we exhibit a mechanism that, for every k greater than or equal to 0 and epsilon>0 and whenever the players are order-(k+1) rational, guarantees revenue greater than or equal to G^k-epsilon, where G^k is the second highest belief about belief about ... (k times) about the highest valuation of some player, even when such a player's identity is not precisely known. Importantly, our mechanism is possibilistic interim individually rational. Essentially this means that, based on his beliefs, a player's utility is non-negative not in expectation, but in each world he believes possible. We finally show that our benchmark G^k is so demanding that it separates the revenue achievable with order-k rational players from that achievable with order-(k+1) rational ones. That is, no possibilistic interim individually rational mechanism can guarantee revenue greater than or equal to G^k-c, for any constant c>0, when the players are only order-k rational.
2012年6月22日 00:00:00 GMThttps://hdl.handle.net/1721.1/712322012年06月22日T00:00:00ZThrowing Down the Visual Intelligence Gauntlet
https://hdl.handle.net/1721.1/71199
Throwing Down the Visual Intelligence Gauntlet
Tan, Cheston; Leibo, Joel Z; Poggio, Tomaso
In recent years, scientific and technological advances have produced artificial systems that have matched or surpassed human capabilities in narrow domains such as face detection and optical character recognition. However, the problem of producing truly intelligent machines still remains far from being solved. In this chapter, we first describe some of these recent advances, and then review one approach to moving beyond these limited successes---the neuromorphic approach of studying and reverse-engineering the networks of neurons in the human brain (specifically, the visual system). Finally, we discuss several possible future directions in the quest for visual intelligence.
2012年1月01日 00:00:00 GMThttps://hdl.handle.net/1721.1/711992012年01月01日T00:00:00ZOptimal Parametric Auctions
https://hdl.handle.net/1721.1/71170
Optimal Parametric Auctions
Azar, Pablo Daniel; Micali, Silvio
We study the problem of an auctioneer who wants to maximize her profits. In our model, there are n buyers with private valuations drawn from independent distributions F_1,...,F_n. When these distributions are known to the seller, Myerson's optimal auction is a well known mechanism that maximizes revenue. However, in many cases it is too strong to assume that the seller knows these distributions. We propose an alternative model where the seller only knows the mean mu_i and variance sigma_i^2 of each distribution F_i. We call mechanisms that only use this information parametric auctions. We construct such auctions for all single-dimensional downward closed environments. For a very large class of distributions, including (but not limited to) distributions with a monotone hazard rate, our auctions achieve a constant fraction of the revenue of Myerson's auction. When the seller has absolutely no knowledge about the distributions, it is well known that no auction can achieve a constant fraction of the optimal revenue when the players are not identically distributed. Our parametric model gives the seller a small amount of extra information, allowing her to construct auctions for which (1) she does not know the full distribution of valuations, (2) no two bidders need to be drawn from identical distributions and (3) the revenue obtained is a constant fraction of the revenue in Myerson's optimal auction. For digital goods environments we present a different parametric auction that not only gives a better approximation to the optimal auction, but that is also optimal in a new sense, which we call maximin optimality. Informally, an auction is maximin optimal if it maximizes revenue in the worst case over an adversary's choice of the distribution. We show that our digital parametric auction is maximin optimal among the class of posted-price mechanisms.
2012年6月14日 00:00:00 GMThttps://hdl.handle.net/1721.1/711702012年06月14日T00:00:00ZThe Levels of Understanding framework, revised
https://hdl.handle.net/1721.1/70970
The Levels of Understanding framework, revised
Poggio, Tomaso
I discuss the "levels of understanding" framework described in Marr's Vision and propose a revised and updated version of it to capture the changes in computation and neuroscience over the last 30 years.
2012年5月31日 00:00:00 GMThttps://hdl.handle.net/1721.1/709702012年05月31日T00:00:00ZTemporally Scalable Visual SLAM using a Reduced Pose Graph
https://hdl.handle.net/1721.1/70952
Temporally Scalable Visual SLAM using a Reduced Pose Graph
Johannsson, Hordur; Kaess, Michael; Fallon, Maurice; Leonard, John J.
In this paper, we demonstrate a system for temporally scalable visual SLAM using a reduced pose graph representation. Unlike previous visual SLAM approaches that use keyframes, our approach continually uses new measurements to improve the map, yet achieves efficiency by avoiding adding redundant frames and not using marginalization to reduce the graph. To evaluate our approach, we present results using an online binocular visual SLAM system that uses place recognition for both robustness and multi-session operation. To allow large-scale indoor mapping, our system automatically handles elevator rides based on accelerometer data. We demonstrate long-term mapping in a large multi-floor building, using approximately nine hours of data collected over the course of six months. Our results illustrate the capability of our visual SLAM system to scale in size with the area of exploration instead of the time of exploration.
2012年5月25日 00:00:00 GMThttps://hdl.handle.net/1721.1/709522012年05月25日T00:00:00ZA Case for Fine-Grain Adaptive Cache Coherence
https://hdl.handle.net/1721.1/70909
A Case for Fine-Grain Adaptive Cache Coherence
Kurian, George; Khan, Omer; Devadas, Srinivas
As transistor density continues to grow geometrically, processor manufacturers are already able to place a hundred cores on a chip (e.g., Tilera TILE-Gx 100), with massive multicore chips on the horizon. Programmers now need to invest more effort in designing software capable of exploiting multicore parallelism. The shared memory paradigm provides a convenient layer of abstraction to the programmer, but will current memory architectures scale to hundreds of cores? This paper directly addresses the question of how to enable scalable memory systems for future multicores. We develop a scalable, efficient shared memory architecture that enables seamless adaptation between private and logically shared caching at the fine granularity of cache lines. Our data-centric approach relies on in-hardware runtime profiling of the locality of each cache line and only allows private caching for data blocks with high spatio-temporal locality. This allows us to better exploit on-chip cache capacity and enable low-latency memory access in large-scale multicores.
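A software sketch of the data-centric policy (the threshold is invented; the paper's classifier runs in hardware): profile each cache line's reuse per core and permit private caching only once the line shows enough spatio-temporal locality.

    from collections import defaultdict

    REUSE_THRESHOLD = 4    # accesses by one core before private caching pays off
                           # (hypothetical value for illustration)

    class LineClassifier:
        def __init__(self):
            self.reuse = defaultdict(int)        # (core, line) -> access count

        def access(self, core: int, line: int) -> str:
            self.reuse[(core, line)] += 1
            if self.reuse[(core, line)] >= REUSE_THRESHOLD:
                return "private"   # replicate locally; track with coherence
            return "remote"        # service at the shared cache, no replication

    clf = LineClassifier()
    for _ in range(5):
        print(clf.access(core=0, line=0xdead))   # remote x3, then private x2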
2012年5月22日 00:00:00 GMThttps://hdl.handle.net/1721.1/709092012年05月22日T00:00:00ZOptimal Parametric Auctions
https://hdl.handle.net/1721.1/70556
Optimal Parametric Auctions
Azar, Pablo; Micali, Silvio
Theory of Computation
We study the problem of profit maximization in auctions of one good where the buyers' valuations are drawn from independent distributions. When these distributions are known to the seller, Myerson's optimal auction is a well-known mechanism for maximizing revenue. In many cases, however, the seller may not know the buyers' distributions. We propose an alternative model where the seller only knows the mean and the variance of each distribution. We call an auction parametric if its mechanism uses only these parameters. We construct parametric auctions both when the seller only has one copy of the good to sell, and when she has an infinite number of identical copies (i.e., when the good is digital). For a very large class of distributions, including (but not limited to) distributions with a monotone hazard rate, our auctions achieve a constant fraction of the revenue of Myerson's auction. When the seller has absolutely no knowledge about the distributions, it is well known that no auction can achieve a constant fraction of the optimal revenue when the players are not identically distributed. Our parametric model gives the seller a small amount of extra information, allowing her to construct auctions for which (1) no two bidders need to be drawn from identical distributions and (2) the revenue obtained is a constant fraction of the revenue in Myerson's optimal auction.
2012年5月08日 00:00:00 GMThttps://hdl.handle.net/1721.1/705562012年05月08日T00:00:00ZPreliminary MEG decoding results
https://hdl.handle.net/1721.1/70170
Preliminary MEG decoding results
Isik, Leyla; Meyers, Ethan M.; Leibo, Joel Z.; Poggio, Tomaso
Decoding analysis has been applied to electrophysiology and fMRI data to study the visual system; however, this method has been applied to MEG visual data in only a few instances. Here we use the Neural Decoding Toolbox for Matlab to show that it is possible to decode visual stimuli based on MEG data.
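For readers without the Matlab toolbox, the sketch below reproduces the shape of such a decoding analysis in Python with scikit-learn: a cross-validated linear classifier over per-trial sensor vectors. The synthetic data, sensor count, and classifier choice are all illustrative assumptions, not the paper's pipeline.

    # A rough analogue of stimulus decoding from MEG sensor data, on synthetic
    # data (the paper uses the Matlab Neural Decoding Toolbox; everything here
    # is illustrative).
    import numpy as np
    from sklearn.model_selection import cross_val_score
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import LinearSVC

    rng = np.random.default_rng(0)
    n_trials, n_sensors, n_classes = 200, 306, 4    # 306 sensors, as on many MEG systems
    labels = rng.integers(n_classes, size=n_trials)
    # Synthetic "evoked responses": class-dependent mean plus sensor noise.
    class_means = rng.normal(0, 1, size=(n_classes, n_sensors))
    X = class_means[labels] + rng.normal(0, 3, size=(n_trials, n_sensors))

    clf = make_pipeline(StandardScaler(), LinearSVC())
    scores = cross_val_score(clf, X, labels, cv=5)
    print("decoding accuracy: %.2f (chance = %.2f)"
          % (scores.mean(), 1.0 / n_classes))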
2012年4月20日 00:00:00 GMThttps://hdl.handle.net/1721.1/701702012年04月20日T00:00:00ZA Method for Fast, High-Precision Characterization of Synthetic Biology Devices
https://hdl.handle.net/1721.1/69973
A Method for Fast, High-Precision Characterization of Synthetic Biology Devices
Beal, Jacob; Weiss, Ron; Yaman, Fusun; Davidsohn, Noah; Adler, Aaron
Engineering biological systems with predictable behavior is a foundational goal of synthetic biology. To accomplish this, it is important to accurately characterize the behavior of biological devices. Prior characterization efforts, however, have generally not yielded enough high-quality information to enable compositional design. In the TASBE (A Tool-Chain to Accelerate Synthetic Biological Engineering) project we have developed a new characterization technique capable of producing such data. This document describes the techniques we have developed, along with examples of their application, so that the techniques can be accurately used by others.
2012年4月07日 00:00:00 GMThttps://hdl.handle.net/1721.1/699732012年04月07日T00:00:00ZCryptographic Treatment of CryptDB's Adjustable Join
https://hdl.handle.net/1721.1/69859
Cryptographic Treatment of CryptDB's Adjustable Join
Popa, Raluca Ada; Zeldovich, Nickolai
In this document, we provide a cryptographic treatment of the adjustable join protocol from CryptDB. We also discuss how our scheme could be used outside of CryptDB because it provides a simple functionality that may be needed in other settings. Intuitively, it is a pseudorandom permutation where an external party not knowing the secret key can nonetheless adjust a ciphertext under one key to a ciphertext under a different key, given an adjustment token from a party that knows the secret key.
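As a rough illustration of that functionality, the toy Python sketch below realizes an adjustable deterministic encryption over a multiplicative group mod a prime: a party holding both keys issues an adjustment token that re-keys ciphertexts by exponentiation. The group, hash, and key choices here are illustrative and insecure; the real scheme differs in its choice of group and PRF.

    # Toy adjustable "join encryption": enc(k, x) = g^(k*H(x)). Re-keying from
    # k1 to k2 only needs the token k2/k1 in the exponent group. Parameters
    # are small and NOT secure; for illustration only.
    import hashlib

    p = 2**127 - 1                  # Mersenne prime modulus (illustrative)
    q = p - 1                       # exponents live mod the group order
    g = 3                           # generator choice (illustrative)

    def H(x):                       # hash a plaintext into the exponent space
        return int.from_bytes(hashlib.sha256(x.encode()).digest(), "big") % q

    def enc(key, x):                # deterministic join ciphertext of x under key
        return pow(g, key * H(x) % q, p)

    def adj_token(k_from, k_to):    # issued by the party holding both keys
        return k_to * pow(k_from, -1, q) % q

    def adjust(c, token):           # anyone holding the token can re-key c
        return pow(c, token, p)

    k1, k2 = 2**61 - 1, 2**89 - 1   # keys chosen coprime to q
    c1 = enc(k1, "alice")
    assert adjust(c1, adj_token(k1, k2)) == enc(k2, "alice")
    print("re-keyed ciphertext matches a fresh encryption under k2")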
2012年3月25日 00:00:00 GMThttps://hdl.handle.net/1721.1/698592012年03月25日T00:00:00ZA Lossy, Synchronization-Free, Race-Full, But Still Acceptably Accurate Parallel Space-Subdivision Tree Construction Algorithm
https://hdl.handle.net/1721.1/69177
A Lossy, Synchronization-Free, Race-Full, But Still Acceptably Accurate Parallel Space-Subdivision Tree Construction Algorithm
Rinard, Martin
We present a new synchronization-free space-subdivision tree construction algorithm. Despite data races, this algorithm produces trees that are consistent enough for the client Barnes-Hut center of mass and force computation phases to use successfully. Our performance results show that eliminating synchronization improves the performance of the parallel algorithm by approximately 20%. End-to-end accuracy results show that the resulting partial data structure corruption has a negligible effect on the overall accuracy of the Barnes-Hut N-body simulation. We note that many data structure manipulation algorithms use many of the same basic operations (linked data structure updates and array insertions) as our tree construction algorithm. We therefore anticipate that the basic principles that we develop in this paper may effectively guide future efforts in this area.
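The sketch below conveys the structure of such an algorithm in Python: several threads insert points into a shared quadtree with no synchronization at all, so concurrent leaf claims and splits can race and occasionally drop points. (CPython's GIL makes races far rarer than in the paper's native parallel setting; this sketch only illustrates the shape of the algorithm, not the paper's implementation.)

    import random, threading

    class Node:
        def __init__(self, x0, y0, size):
            self.x0, self.y0, self.size = x0, y0, size
            self.children = [None] * 4     # four quadrants
            self.point = None              # payload when this node is a leaf

    def quadrant(node, p):
        half = node.size / 2
        return (p[0] >= node.x0 + half) + 2 * (p[1] >= node.y0 + half)

    def make_child(node, q):
        half = node.size / 2
        return Node(node.x0 + half * (q & 1), node.y0 + half * (q >> 1), half)

    def insert(node, p):
        while True:
            if node.point is None and not any(node.children):
                node.point = p             # claim an empty leaf -- racy!
                return
            if node.point is not None:     # split a full leaf -- racy!
                old, node.point = node.point, None
                qo = quadrant(node, old)
                if node.children[qo] is None:
                    node.children[qo] = make_child(node, qo)
                insert(node.children[qo], old)
            q = quadrant(node, p)
            if node.children[q] is None:
                node.children[q] = make_child(node, q)
            node = node.children[q]        # descend and retry

    root = Node(0.0, 0.0, 1.0)
    pts = [(random.random(), random.random()) for _ in range(10000)]
    workers = [threading.Thread(target=lambda chunk: [insert(root, p) for p in chunk],
                                args=(pts[i::4],)) for i in range(4)]
    for w in workers: w.start()
    for w in workers: w.join()
    print("tree built without locks (racing inserts may drop a few points)")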
2012年2月23日 00:00:00 GMThttps://hdl.handle.net/1721.1/691772012年02月23日T00:00:00ZDSENT - A Tool Connecting Emerging Photonics with Electronics for Opto-Electronic Networks-on-Chip Modeling
https://hdl.handle.net/1721.1/69050
DSENT - A Tool Connecting Emerging Photonics with Electronics for Opto-Electronic Networks-on-Chip Modeling
Sun, Chen; Chen, Chia-Hsin Owen; Kurian, George; Wei, Lan; Miller, Jason; Agarwal, Anant; Peh, Li-Shiuan; Stojanovic, Vladimir
With the advent of many-core chips that place substantial demand on the NoC, photonics has been investigated as a promising alternative to electrical NoCs. While numerous opto-electronic NoCs have been proposed, their evaluations tend to be based on fixed numbers for both photonic and electrical components, making it difficult to co-optimize. Through our own forays into opto-electronic NoC design, we observe that photonics and electronics are very much intertwined, reflecting a strong need for a NoC modeling tool that accurately models parameterized electronic and photonic components within a unified framework, capturing their interactions faithfully. In this paper, we present a tool, DSENT, for design space exploration of electrical and opto-electrical networks. We form a framework that constructs basic NoC building blocks from electrical and photonic technology parameters. To demonstrate potential use cases, we perform a network case study illustrating data-rate tradeoffs, a comparison with scaled electrical technology, and sensitivity to photonics parameters.
2012年2月08日 00:00:00 GMThttps://hdl.handle.net/1721.1/690502012年02月08日T00:00:00ZGURLS: a Toolbox for Regularized Least Squares Learning
https://hdl.handle.net/1721.1/69034
GURLS: a Toolbox for Regularized Least Squares Learning
Tacchetti, Andrea; Mallapragada, Pavan S.; Santoro, Matteo; Rosasco, Lorenzo
We present GURLS, a toolbox for supervised learning based on the regularized least squares algorithm. The toolbox takes advantage of all the favorable properties of least squares and is tailored to deal in particular with multi-category/multi-label problems. One of the main advantages of GURLS is that it allows training and tuning a multi-category classifier at essentially the same cost as a single binary classifier. The toolbox provides a set of basic functionalities including different training strategies and routines to handle computations with very large matrices by means of both memory-mapped storage and distributed task execution. The system is modular and can serve as a basis for easily prototyping new algorithms. The toolbox is available for download, easy to set up and use.
2012年1月31日 00:00:00 GMThttps://hdl.handle.net/1721.1/690342012年01月31日T00:00:00ZToward a Probabilistic Approach to Acquiring Information from Human Partners Using Language
https://hdl.handle.net/1721.1/68651
Toward a Probabilistic Approach to Acquiring Information from Human Partners Using Language
Tellex, Stefanie; Thaker, Pratiksha; Deits, Robin; Simeonov, Dimitar; Kollar, Thomas; Roy, Nicholas
Our goal is to build robots that can robustly interact with humans using natural language. This problem is extremely challenging because human language is filled with ambiguity, and furthermore, the robot's model of the environment might be much more limited than the human partner's. When humans encounter ambiguity in dialog with each other, a key strategy to resolve it is to ask clarifying questions about what they do not understand. This paper describes an approach for enabling robots to take the same approach: asking the human partner clarifying questions about ambiguous commands in order to infer better actions. The robot fuses information from the command, the question, and the answer by creating a joint probabilistic graphical model in the Generalized Grounding Graph framework. We demonstrate that by performing inference using information from the command, question and answer, the robot is able to infer object groundings and follow commands with higher accuracy than by using the command alone.
2012年1月23日 00:00:00 GMThttps://hdl.handle.net/1721.1/686512012年01月23日T00:00:00ZA Benchmark of Computational Models of Saliency to Predict Human Fixations
https://hdl.handle.net/1721.1/68590
A Benchmark of Computational Models of Saliency to Predict Human Fixations
Judd, Tilke; Durand, Frédo; Torralba, Antonio
Many computational models of visual attention have been created from a wide variety of different approaches to predict where people look in images. Each model is usually introduced by demonstrating performances on new images, and it is hard to make immediate comparisons between models. To alleviate this problem, we propose a benchmark data set containing 300 natural images with eye tracking data from 39 observers to compare model performances. We calculate the performance of 10 models at predicting ground truth fixations using three different metrics. We provide a way for people to submit new models for evaluation online. We find that the Judd et al. and Graph-based visual saliency models perform best. In general, models with blurrier maps and models that include a center bias perform well. We add and optimize a blur and center bias for each model and show improvements. We compare performances to baseline models of chance, center and human performance. We show that human performance increases with the number of humans to a limit. We analyze the similarity of different models using multidimensional scaling and explore the relationship between model performance and fixation consistency. Finally, we offer observations about how to improve saliency models in the future.
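As a concrete example of the kind of metric used, the sketch below computes a common ROC-AUC score that treats a saliency map as a classifier separating fixated pixels from randomly sampled non-fixated ones. The random map and fixation points stand in for a real model and eye-tracking data, and the benchmark's actual metric variants differ in detail.

    # AUC of a saliency map as a fixated-vs-random-pixel classifier.
    import numpy as np

    def saliency_auc(sal_map, fix_points, n_neg=10000, seed=0):
        rng = np.random.default_rng(seed)
        h, w = sal_map.shape
        pos = np.array([sal_map[y, x] for (y, x) in fix_points])
        neg = sal_map[rng.integers(h, size=n_neg), rng.integers(w, size=n_neg)]
        # AUC = P(random fixated pixel scores higher than random non-fixated one)
        return ((pos[:, None] > neg[None, :]).mean()
                + 0.5 * (pos[:, None] == neg[None, :]).mean())

    rng = np.random.default_rng(1)
    sal = rng.random((240, 320))                      # stand-in saliency map
    fixations = [(rng.integers(240), rng.integers(320)) for _ in range(39)]
    print("AUC of a random map: %.3f (chance ~ 0.5)" % saliency_auc(sal, fixations))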
2012年1月13日 00:00:00 GMThttps://hdl.handle.net/1721.1/685902012年01月13日T00:00:00ZStructuring Unreliable Radio Networks
https://hdl.handle.net/1721.1/67885
Structuring Unreliable Radio Networks
Censor-Hillel, Keren; Gilbert, Seth; Kuhn, Fabian; Lynch, Nancy; Newport, Calvin
In this paper we study the problem of building a connected dominating set with constant degree (CCDS) in the dual graph radio network model. This model includes two types of links: reliable links, which always deliver messages, and unreliable links, which sometimes fail to deliver messages. Real networks compensate for this differing quality by deploying low-layer detection protocols to filter unreliable from reliable links. With this in mind, we begin by presenting an algorithm that solves the CCDS problem in the dual graph model under the assumption that every process u is provided with a local "link detector set" consisting of every neighbor connected to u by a reliable link. The algorithm solves the CCDS problem in O((Delta log^2 n)/b + log^3 n) rounds, with high probability, where Delta is the maximum degree in the reliable link graph, n is the network size, and b is an upper bound in bits on the message size. The algorithm works by first building a Maximal Independent Set (MIS) in O(log^3 n) time, and then leveraging the local topology knowledge to efficiently connect nearby MIS processes. A natural follow-up question is whether the link detector must be perfectly reliable to solve the CCDS problem. To answer this question, we first describe an algorithm that builds a CCDS in O(Delta polylog(n)) time under the assumption of O(1) unreliable links included in each link detector set. We then prove this algorithm to be (almost) tight by showing that the possible inclusion of only a single unreliable link in each process's local link detector set is sufficient to require Omega(Delta) rounds to solve the CCDS problem, regardless of message size. We conclude by discussing how to apply our algorithm in the setting where the topology of reliable and unreliable links can change over time.
2011年12月22日 00:00:00 GMThttps://hdl.handle.net/1721.1/678852011年12月22日T00:00:00ZA Frequency Analysis of Monte-Carlo and other Numerical Integration Schemes
https://hdl.handle.net/1721.1/67677
A Frequency Analysis of Monte-Carlo and other Numerical Integration Schemes
Durand, Frédo
The numerical calculation of integrals is central to many computer graphics algorithms such as Monte-Carlo Ray Tracing. We show that such methods can be studied using Fourier analysis. Numerical error is shown to correspond to aliasing and the link between properties of the sampling pattern and the integrand is studied. The approach also permits the unified study of image aliasing and numerical integration, by considering a multidimensional domain where some dimensions are integrated while others are sampled.
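A small worked example conveys the aliasing view: a pure cosine of frequency k integrates to zero over [0, 1], but a regular grid of n = k samples aliases that frequency down to a constant and gets the integral badly wrong, whereas Monte-Carlo and jittered sampling turn the same spectral energy into zero-mean noise. (The specific integrand and sample counts below are illustrative.)

    import numpy as np

    k, n = 64, 64                             # integrand frequency == sample count
    f = lambda x: np.cos(2 * np.pi * k * x)   # true integral over [0,1] is 0

    grid = np.arange(n) / n                   # regular samples hit every crest
    rng = np.random.default_rng(0)
    mc = rng.random(n)                        # uniform random samples
    jit = (np.arange(n) + rng.random(n)) / n  # one random sample per stratum

    for name, xs in [("grid", grid), ("monte-carlo", mc), ("jittered", jit)]:
        print("%-12s estimate = %+.4f" % (name, f(xs).mean()))
    # grid aliases to ~1.0; the random estimators are small zero-mean noise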
2011年12月14日 00:00:00 GMThttps://hdl.handle.net/1721.1/676772011年12月14日T00:00:00ZCPHash: A Cache-Partitioned Hash Table
https://hdl.handle.net/1721.1/67296
CPHash: A Cache-Partitioned Hash Table
Metreveli, Zviad; Zeldovich, Nickolai; Kaashoek, M. Frans
CPHash is a concurrent hash table for multicore processors. CPHash partitions its table across the caches of cores and uses message passing to transfer lookups/inserts to a partition. CPHash's message passing avoids the need for locks, pipelines batches of asynchronous messages, and packs multiple messages into a single cache line transfer. Experiments on an 80-core machine with 2 hardware threads per core show that CPHash has ~1.6x higher throughput than a hash table implemented using fine-grained locks. An analysis shows that CPHash wins because it experiences fewer cache misses and its cache misses are less expensive, owing to reduced contention for the on-chip interconnect and DRAM. CPServer, a key/value cache server using CPHash, achieves ~5% higher throughput than a key/value cache server that uses a hash table with fine-grained locks, but both achieve better throughput and scalability than memcached. Finally, the throughput of CPHash and CPServer scales near-linearly with the number of cores.
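The Python sketch below conveys the core idea, with queues modeling the message passing: each partition is owned by a single server thread, and clients send lookup/insert messages instead of taking locks. The real system additionally pins partitions to cores and packs message batches into cache lines, which a high-level sketch cannot capture.

    import queue, threading

    N_PARTITIONS = 4
    inboxes = [queue.Queue() for _ in range(N_PARTITIONS)]

    def server(part_id):
        table = {}                          # owned exclusively by this thread
        while True:
            op, key, value, reply = inboxes[part_id].get()
            if op == "stop":
                return
            if op == "insert":
                table[key] = value
                reply.put(None)
            else:                           # "lookup"
                reply.put(table.get(key))

    def send(op, key, value=None):
        reply = queue.Queue(maxsize=1)
        inboxes[hash(key) % N_PARTITIONS].put((op, key, value, reply))
        return reply.get()

    servers = [threading.Thread(target=server, args=(i,)) for i in range(N_PARTITIONS)]
    for s in servers: s.start()
    send("insert", "foo", 42)
    print(send("lookup", "foo"))            # -> 42
    for box in inboxes: box.put(("stop", None, None, None))
    for s in servers: s.join()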
2011年11月26日 00:00:00 GMThttps://hdl.handle.net/1721.1/672962011年11月26日T00:00:00ZReasoning about Relaxed Programs
https://hdl.handle.net/1721.1/67031
Reasoning about Relaxed Programs
Carbin, Michael; Kim, Deokhwan; Misailovic, Sasa; Rinard, Martin C.
A number of approximate program transformations have recently emerged that enable transformed programs to trade accuracy of their results for increased performance by dynamically and nondeterministically modifying variables that control program execution. We call such transformed programs relaxed programs -- they have been extended with additional nondeterminism to relax their semantics and offer greater execution flexibility. We present programming language constructs for developing relaxed programs and proof rules for reasoning about properties of relaxed programs. Our proof rules enable programmers to directly specify and verify acceptability properties that characterize the desired correctness relationships between the values of variables in a program's original semantics (before transformation) and its relaxed semantics. Our proof rules also support the verification of safety properties (which characterize desirable properties involving values in individual executions). The rules are designed to support a reasoning approach in which the majority of the reasoning effort uses the original semantics. This effort is then reused to establish the desired properties of the program under the relaxed semantics. We have formalized the dynamic semantics of our target programming language and the proof rules in Coq, and verified that the proof rules are sound with respect to the dynamic semantics. Our Coq implementation enables developers to obtain fully machine checked verifications of their relaxed programs.
2011年11月15日 00:00:00 GMThttps://hdl.handle.net/1721.1/670312011年11月15日T00:00:00ZFast and Robust Pyramid-based Image Processing
https://hdl.handle.net/1721.1/67030
Fast and Robust Pyramid-based Image Processing
Aubry, Mathieu; Paris, Sylvain; Hasinoff, Samuel W.; Kautz, Jan; Durand, Frédo
Multi-scale manipulations are central to image editing but they are also prone to halos. Achieving artifact-free results requires sophisticated edge-aware techniques and careful parameter tuning. These shortcomings were recently addressed by the local Laplacian filters, which can achieve a broad range of effects using standard Laplacian pyramids. However, these filters are slow to evaluate and their relationship to other approaches is unclear. In this paper, we show that they are closely related to anisotropic diffusion and to bilateral filtering. Our study also leads to a variant of the bilateral filter that produces cleaner edges while retaining its speed. Building upon this result, we describe an acceleration scheme for local Laplacian filters that yields speed-ups on the order of 50x. Finally, we demonstrate how to use local Laplacian filters to alter the distribution of gradients in an image. We illustrate this property with a robust algorithm for photographic style transfer.
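For orientation, the numpy sketch below builds and exactly collapses the standard Laplacian pyramid that these filters operate on; the local Laplacian remapping of each level's coefficients, which is the paper's subject, is omitted, and the blur and downsampling choices are illustrative.

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def down(img, sigma=1.0):
        return gaussian_filter(img, sigma)[::2, ::2]

    def up(img, shape, sigma=1.0):
        out = np.zeros(shape)
        out[::2, ::2] = img                  # upsample with zeros...
        return gaussian_filter(out, sigma) * 4.0   # ...then interpolate

    def build(img, levels=4):
        pyr, g = [], img.astype(float)
        for _ in range(levels):
            smaller = down(g)
            pyr.append(g - up(smaller, g.shape))   # band-pass level
            g = smaller
        pyr.append(g)                              # coarsest low-pass residual
        return pyr

    def collapse(pyr):
        g = pyr[-1]
        for band in reversed(pyr[:-1]):
            g = band + up(g, band.shape)
        return g

    img = np.random.default_rng(0).random((64, 64))
    assert np.allclose(collapse(build(img)), img)  # the pyramid is invertible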
2011年11月15日 00:00:00 GMThttps://hdl.handle.net/1721.1/670302011年11月15日T00:00:00ZSEEC: A General and Extensible Framework for Self-Aware Computing
https://hdl.handle.net/1721.1/67020
SEEC: A General and Extensible Framework for Self-Aware Computing
Hoffmann, Henry; Maggio, Martina; Santambrogio, Marco D.; Leva, Alberto; Agarwal, Anant
Modern systems require applications to balance competing goals, e.g. achieving high performance and low power. Achieving this balance places an unrealistic burden on application programmers who must understand the power and performance implications of a variety of application and system actions (e.g. changing algorithms or allocating cores). To address this problem, we propose the Self-aware Computing framework, or SEEC. SEEC automatically and dynamically schedules actions to meet application specified goals. While other self-aware implementations have been proposed, SEEC is uniquely distinguished by its decoupled approach, which allows application and systems programmers to separately specify observations and actions, according to their expertise. SEEC's runtime decision engine observes the system and schedules actions automatically, reducing programmer burden. This general and extensible decision engine employs both control theory and machine learning to reason about previously unseen applications and actions while automatically adapting to changes in both application and system models. This paper describes the SEEC framework and evaluates it in several case studies. SEEC is used to build an adaptive system that optimizes performance per Watt for the PARSEC benchmarks on multiple machines, achieving results at least 1.65x better than a classical control system. Additional studies show how SEEC can learn optimal resource allocation online and respond to fluctuations in the underlying hardware while managing multiple applications.
2011年11月07日 00:00:00 GMThttps://hdl.handle.net/1721.1/670202011年11月07日T00:00:00ZLeader Election Using Loneliness Detection
https://hdl.handle.net/1721.1/66224
Leader Election Using Loneliness Detection
Ghaffari, Mohsen; Lynch, Nancy; Sastry, Srikanth
We consider the problem of leader election (LE) in single-hop radio networks with synchronized time slots for transmitting and receiving messages. We assume that the actual number n of processes is unknown, while the size u of the ID space is known, but is possibly much larger. We consider two types of collision detection: strong (SCD), whereby all processes detect collisions, and weak (WCD), whereby only non-transmitting processes detect collisions. We introduce loneliness detection (LD) as a key subproblem for solving LE in WCD systems. LD informs all processes whether the system contains exactly one process or more than one. We show that LD captures the difference in power between SCD and WCD, by providing an implementation of SCD over WCD and LD. We present two algorithms that solve deterministic and probabilistic LD in WCD systems with time costs of O(log(u/n)) and O(min(log(u/n), log(1/epsilon)/n)), respectively, where epsilon is the error probability. We also provide matching lower bounds. We present two algorithms that solve deterministic and probabilistic LE in SCD systems with time costs of O(log u) and O(min(log u, log log n + log(1/epsilon))), respectively, where epsilon is the error probability. We provide matching lower bounds.
2011年10月12日 00:00:00 GMThttps://hdl.handle.net/1721.1/662242011年10月12日T00:00:00ZAutomatic Input Rectification
https://hdl.handle.net/1721.1/66170
Automatic Input Rectification
Long, Fan; Ganesh, Vijay; Carbin, Michael; Sidiroglou, Stelios; Rinard, Martin
We present a novel technique, automatic input rectification, and a prototype implementation called SOAP. SOAP learns a set of constraints characterizing typical inputs that an application is highly likely to process correctly. When given an atypical input that does not satisfy these constraints, SOAP automatically rectifies the input (i.e., changes the input so that it satisfies the learned constraints). The goal is to automatically convert potentially dangerous inputs into typical inputs that the program is highly likely to process correctly. Our experimental results show that, for a set of benchmark applications (namely, Google Picasa, ImageMagick, VLC, Swfdec, and Dillo), this approach effectively converts malicious inputs (which successfully exploit vulnerabilities in the application) into benign inputs that the application processes correctly. Moreover, a manual code analysis shows that, if an input does satisfy the learned constraints, it is incapable of exploiting these vulnerabilities. We also present the results of a user study designed to evaluate the subjective perceptual quality of outputs from benign but atypical inputs that have been automatically rectified by SOAP to conform to the learned constraints. Specifically, we obtained benign inputs that violate learned constraints, used our input rectifier to obtain rectified inputs, then paid Amazon Mechanical Turk users to provide their subjective qualitative perception of the difference between the outputs from the original and rectified inputs. The results indicate that rectification can often preserve much, and in many cases all, of the desirable data in the original input.
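The sketch below illustrates the rectification idea on a deliberately simple constraint class: per-field [min, max] bounds learned from typical inputs, with out-of-range fields clamped back into range. SOAP's learned constraints over parsed input formats are richer; the bounds here are a stand-in.

    def learn_constraints(typical_inputs):
        """Per-field [min, max] bounds observed over typical inputs."""
        fields = zip(*typical_inputs)
        return [(min(f), max(f)) for f in fields]

    def rectify(inp, constraints):
        """Clamp each field of an atypical input into its learned range."""
        return [min(max(v, lo), hi) for v, (lo, hi) in zip(inp, constraints)]

    typical = [[640, 480, 24], [800, 600, 24], [1024, 768, 8]]   # e.g. image headers
    bounds = learn_constraints(typical)
    malicious = [640, 2**31 - 1, 24]          # oversized field, e.g. overflow bait
    print(rectify(malicious, bounds))         # -> [640, 768, 24]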
2011年10月03日 00:00:00 GMThttps://hdl.handle.net/1721.1/661702011年10月03日T00:00:00ZMulti-Class Learning: Simplex Coding And Relaxation Error
https://hdl.handle.net/1721.1/66085
Multi-Class Learning: Simplex Coding And Relaxation Error
Mroueh, Youssef; Poggio, Tomaso; Rosasco, Lorenzo; Slotine, Jean-Jacques E.
We study multi-category classification in the framework of computational learning theory. We show how a relaxation approach, which is commonly used in binary classification, can be generalized to the multi-class setting. We propose a vector coding, namely the simplex coding, that allows us to introduce a new notion of multi-class margin and cast multi-category classification into a vector-valued regression problem. The relaxation error can be quantified, and the binary case is recovered as a special case of our theory. From a computational point of view, we show that using the simplex coding we can design regularized learning algorithms for multi-category classification that can be trained at a complexity which is independent of the number of classes.
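The simplex coding itself is easy to construct: center the standard basis of R^T and express the result in an orthonormal basis of the (T-1)-dimensional subspace orthogonal to the all-ones vector, yielding T unit code vectors with pairwise inner product -1/(T-1). A minimal numpy sketch follows; the route via QR is one convenient construction, not necessarily the paper's.

    import numpy as np

    def simplex_code(T):
        E = np.eye(T) - np.ones((T, T)) / T      # centered basis vectors
        # Orthonormal basis of the (T-1)-dim subspace orthogonal to all-ones
        Q, _ = np.linalg.qr(E[:, :T - 1])
        C = E @ Q                                # T code vectors in R^(T-1)
        return C / np.linalg.norm(C, axis=1, keepdims=True)

    C = simplex_code(4)
    print(np.round(C @ C.T, 3))   # 1 on the diagonal, -1/3 off-diagonal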
2011年9月27日 00:00:00 GMThttps://hdl.handle.net/1721.1/660852011年09月27日T00:00:00ZNonparametric Sparsity and Regularization
https://hdl.handle.net/1721.1/65964
Nonparametric Sparsity and Regularization
Mosci, Sofia; Rosasco, Lorenzo; Santoro, Matteo; Verri, Alessandro; Villa, Silvia
In this work we are interested in the problems of supervised learning and variable selection when the input-output dependence is described by a nonlinear function depending on a few variables. Our goal is to consider a sparse nonparametric model, hence avoiding linear or additive models. The key idea is to measure the importance of each variable in the model by making use of partial derivatives. Based on this intuition we propose and study a new regularizer and a corresponding least squares regularization scheme. Using concepts and results from the theory of reproducing kernel Hilbert spaces and proximal methods, we show that the proposed learning algorithm corresponds to a minimization problem which can be provably solved by an iterative procedure. The consistency properties of the obtained estimator are studied both in terms of prediction and selection performance. An extensive empirical analysis shows that the proposed method performs favorably with respect to the state-of-the-art.
2011年9月26日 00:00:00 GMThttps://hdl.handle.net/1721.1/659642011年09月26日T00:00:00ZA hypothesis-based algorithm for planning and control in non-Gaussian belief spaces
https://hdl.handle.net/1721.1/65856
A hypothesis-based algorithm for planning and control in non-Gaussian belief spaces
Platt, Robert, Jr.; Kaelbling, Leslie; Lozano-Perez, Tomas; Tedrake, Russ
We consider the partially observable control problem where it is potentially necessary to perform complex information-gathering operations in order to localize state. One approach to solving these problems is to create plans in belief-space, the space of probability distributions over the underlying state of the system. The belief-space plan encodes a strategy for performing a task while gaining information as necessary. Most approaches to belief-space planning rely upon representing belief state in a particular way (typically as a Gaussian). Unfortunately, this can lead to large errors between the assumed density representation and the true belief state. We propose a new computationally efficient algorithm for planning in non-Gaussian belief spaces. We propose a receding horizon re-planning approach where planning occurs in a low-dimensional sampled representation of belief state while the true belief state of the system is monitored using an arbitrarily accurate high-dimensional representation. Our key contribution is a planning problem that, when solved optimally on each re-planning step, is guaranteed, under certain conditions, to enable the system to gain information. We prove that when these conditions are met, the algorithm converges with probability one. We characterize algorithm performance for different parameter settings in simulation and report results from a robot experiment that illustrates the application of the algorithm to robot grasping.
2011年8月27日 00:00:00 GMThttps://hdl.handle.net/1721.1/658562011年08月27日T00:00:00ZLearning and disrupting invariance in visual recognition
https://hdl.handle.net/1721.1/65646
Learning and disrupting invariance in visual recognition
Isik, Leyla; Leibo, Joel Z; Poggio, Tomaso
Learning by temporal association rules such as Foldiak's trace rule is an attractive hypothesis that explains the development of invariance in visual recognition. Consistent with these rules, several recent experiments have shown that invariance can be broken by appropriately altering the visual environment but found puzzling differences in the effects at the psychophysical versus single-cell level. We show a) that associative learning provides appropriate invariance in models of object recognition inspired by Hubel and Wiesel b) that we can replicate the "invariance disruption" experiments using these models with a temporal association learning rule to develop and maintain invariance, and c) that we can thereby explain the apparent discrepancies between psychophysics and single-cell effects. We argue that these models account for the stability of perceptual invariance despite the underlying plasticity of the system, the variability of the visual world and expected noise in the biological mechanisms.
2011年9月10日 00:00:00 GMThttps://hdl.handle.net/1721.1/656462011年09月10日T00:00:00ZTragedy of the routing table: An analysis of collective action amongst Internet network operators
https://hdl.handle.net/1721.1/65591
Tragedy of the routing table: An analysis of collective action amongst Internet network operators
Woodrow, Stephen Robert
This thesis analyzes and discusses the effectiveness of social efforts to achieve collective action amongst Internet network operators in order to manage the growth of the Internet routing table. The size and rate of growth of the Internet routing table is an acknowledged challenge impeding the scalability of our BGP interdomain routing architecture. While most of the work towards a solution to this problem has focused on architectural improvements, an effort launched in the 1990s called the CIDR Report attempts to incentivize route aggregation using social forces and norms in the Internet operator community. This thesis analyzes the behavior of Internet network operators in response to the CIDR Report from 1997 to 2011 to determine whether the Report was effective in achieving this goal. While it is difficult to causally attribute aggregation behavior to appearance on the CIDR report, there is a trend for networks to improve their prefix aggregation following an appearance on the CIDR Report compared to untreated networks. This suggests that the CIDR Report did affect network aggregation behavior, although the routing table continued to grow. This aggregation improvement is most prevalent early in the study period and becomes less apparent as time goes on. Potential causes of the apparent change in efficacy of the Report are discussed and examined using Ostrom's Common Pool Resource framework. The thesis then concludes with a discussion of options for mitigating routing table growth, including the continued use of community forces to better manage the Internet routing table.
S.M. thesis
2011年8月06日 00:00:00 GMThttps://hdl.handle.net/1721.1/655912011年08月06日T00:00:00ZMOOS-IvP Autonomy Tools Users Manual Release 4.2.1
https://hdl.handle.net/1721.1/65074
MOOS-IvP Autonomy Tools Users Manual Release 4.2.1
Benjamin, Michael R.
This document describes 19 MOOS-IvP autonomy tools. uHelmScope provides a run-time scoping window into the state of an active IvP Helm executing its mission. pMarineViewer is a geo-based GUI tool for rendering marine vehicles and geometric data in their operational area. uXMS is a terminal based tool for scoping on a MOOSDB process. uTermCommand is a terminal based tool for poking a MOOSDB with a set of MOOS file pre-defined variable-value pairs selectable with aliases from the command-line. pEchoVar provides a way of echoing a post to one MOOS variable with a new post having the same value to a different variable. uProcessWatch monitors the presence or absence of a set of MOOS processes and summarizes the collective status in a single MOOS variable. uPokeDB provides a way of poking the MOOSDB from the command line with one or more variable-value pairs without any pre-existing configuration of a MOOS file. uTimerScript will execute a pre-defined timed pausable script of poking variable-value pairs to a MOOSDB. pNodeReporter summarizes a platform's critical information into a single node report string for sharing beyond the vehicle. pBasicContactMgr provides a basic contact management service with the ability to generate range-dependent configurable alerts. uSimMarine provides a simple marine vehicle simulator. uSimBeaconRange and uSimContactRange provide further simulation for range-only sensors. The Alog Toolbox is a set of offline tools for analyzing and manipulating log files in the .alog format.
2011年7月28日 00:00:00 GMThttps://hdl.handle.net/1721.1/650742011年07月28日T00:00:00ZAn Overview of MOOS-IvP and a Users Guide to the IvP Helm - Release 4.2.1
https://hdl.handle.net/1721.1/65073
An Overview of MOOS-IvP and a Users Guide to the IvP Helm - Release 4.2.1
Benjamin, Michael R.; Schmidt, Henrik; Newman, Paul; Leonard, John J.
This document describes the IvP Helm - an Open Source behavior-based autonomy application for unmanned vehicles. IvP is short for interval programming - a technique for representing and solving multi-objective optimization problems. Behaviors in the IvP Helm are reconciled using multi-objective optimization when in competition with each other for influence of the vehicle. The IvP Helm is written as a MOOS application where MOOS is a set of Open Source publish-subscribe autonomy middleware tools. This document describes the configuration and use of the IvP Helm, provides examples of simple missions and information on how to download and build the software from the MOOS-IvP server at www.moos-ivp.org.
2011年8月03日 00:00:00 GMThttps://hdl.handle.net/1721.1/650732011年08月03日T00:00:00ZVote the OS off your Core
https://hdl.handle.net/1721.1/64977
Vote the OS off your Core
Belay, Adam; Wentzlaff, David; Agarwal, Anant
Recent trends in OS research have shown evidence that there are performance benefits to running OS services on different cores than the user applications that rely on them. We quantitatively evaluate this claim in terms of one of the most significant architectural constraints: memory performance. To this end, we have created CachEMU, an open-source memory trace generator and cache simulator built as an extension to QEMU for working with system traces. Using CachEMU, we determined that for five common Linux test workloads, it was best to run the OS close, but not too close: on the same package, but not on the same core.
2011年7月27日 00:00:00 GMThttps://hdl.handle.net/1721.1/649772011年07月27日T00:00:00ZA Scalable Information Theoretic Approach to Distributed Robot Coordination
https://hdl.handle.net/1721.1/64821
A Scalable Information Theoretic Approach to Distributed Robot Coordination
Julian, Brian J.; Angermann, Michael; Schwager, Mac; Rus, Daniela
This paper presents a scalable information theoretic approach to infer the state of an environment by distributively controlling robots equipped with sensors. The robots iteratively estimate the environment state using a recursive Bayesian filter, while continuously moving to improve the quality of the estimate by following the gradient of mutual information. Both the filter and the controller use a novel algorithm for approximating the robots' joint measurement probabilities, which combines consensus (for decentralization) and sampling (for scalability). The approximations are shown to approach the true joint measurement probabilities as the size of the consensus rounds grows or as the network becomes complete. The resulting gradient controller runs in constant time with respect to the number of robots, and linear time with respect to the number of sensor measurements and environment discretization cells, while traditional mutual information methods are exponential in all of these quantities. Furthermore, the controller is proven to be convergent between consensus rounds and, under certain conditions, is locally optimal. The complete distributed inference and coordination algorithm is demonstrated in experiments with five quad-rotor flying robots and simulations with 100 robots.
2011年9月25日 00:00:00 GMThttps://hdl.handle.net/1721.1/648212011年09月25日T00:00:00ZKernels for Vector-Valued Functions: a Review
https://hdl.handle.net/1721.1/64731
Kernels for Vector-Valued Functions: a Review
Alvarez, Mauricio A.; Rosasco, Lorenzo; Lawrence, Neil D.
Kernel methods are among the most popular techniques in machine learning. From a frequentist/discriminative perspective they play a central role in regularization theory as they provide a natural choice for the hypotheses space and the regularization functional through the notion of reproducing kernel Hilbert spaces. From a Bayesian/generative perspective they are the key in the context of Gaussian processes, where the kernel function is also known as the covariance function. Traditionally, kernel methods have been used in supervised learning problems with scalar outputs and indeed there has been a considerable amount of work devoted to designing and learning kernels. More recently there has been an increasing interest in methods that deal with multiple outputs, motivated partly by frameworks like multitask learning. In this paper, we review different methods to design or learn valid kernel functions for multiple outputs, paying particular attention to the connection between probabilistic and functional methods.
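As one concrete instance from this design space, the sketch below builds a separable ("intrinsic coregionalization") kernel, K((x,i),(x',j)) = B[i,j] k(x,x'), as a Kronecker product of a positive semidefinite output-coupling matrix B with a scalar RBF kernel. The particular B and kernel parameters are illustrative.

    import numpy as np

    def rbf(X1, X2, lengthscale=1.0):
        d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
        return np.exp(-0.5 * d2 / lengthscale**2)

    def coregionalization_kernel(X1, X2, B, lengthscale=1.0):
        """Joint kernel over (input, output-index) pairs via a Kronecker product."""
        return np.kron(B, rbf(X1, X2, lengthscale))

    rng = np.random.default_rng(0)
    X = rng.random((5, 2))                 # 5 inputs in R^2
    A = rng.random((3, 3))
    B = A @ A.T                            # PSD coupling between 3 outputs
    K = coregionalization_kernel(X, X, B)  # 15 x 15 joint kernel matrix
    print(K.shape, np.all(np.linalg.eigvalsh(K) > -1e-9))   # (15, 15) True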
2011年6月30日 00:00:00 GMThttps://hdl.handle.net/1721.1/647312011年06月30日T00:00:00ZA Software Approach to Unifying Multicore Caches
https://hdl.handle.net/1721.1/64698
A Software Approach to Unifying Multicore Caches
Boyd-Wickizer, Silas; Kaashoek, M. Frans; Morris, Robert; Zeldovich, Nickolai
Multicore chips will have large amounts of fast on-chip cache memory, along with relatively slow DRAM interfaces. The on-chip cache memory, however, will be fragmented and spread over the chip; this distributed arrangement is hard for certain kinds of applications to exploit efficiently, and can lead to needless slow DRAM accesses. First, data accessed from many cores may be duplicated in many caches, reducing the amount of distinct data cached. Second, data in a cache distant from the accessing core may be slow to fetch via the cache coherence protocol. Third, software on each core can only allocate space in the small fraction of total cache memory that is local to that core. A new approach called software cache unification (SCU) addresses these challenges for applications that would be better served by a large shared cache. SCU chooses the on-chip cache in which to cache each item of data. As an application thread reads data items, SCU moves the thread to the core whose on-chip cache contains each item. This allows the thread to read the data quickly if it is already on-chip; if it is not, moving the thread causes the data to be loaded into the chosen on-chip cache. A new file cache for Linux, called MFC, uses SCU to improve performance of file-intensive applications, such as Unix file utilities. An evaluation on a 16-core AMD Opteron machine shows that MFC improves the throughput of file utilities by a factor of 1.6. Experiments with a platform that emulates future machines with less DRAM throughput per core shows that MFC will provide benefit to a growing range of applications.
2011年6月28日 00:00:00 GMThttps://hdl.handle.net/1721.1/646982011年06月28日T00:00:00ZA hierarchical model of peripheral vision
https://hdl.handle.net/1721.1/64621
A hierarchical model of peripheral vision
Isik, Leyla; Leibo, Joel Z.; Mutch, Jim; Lee, Sang Wan; Poggio, Tomaso
We present a peripheral vision model inspired by the cortical architecture discovered by Hubel and Wiesel. As with existing cortical models, this model contains alternating layers of simple cells, which employ tuning functions to increase specificity, and complex cells, which pool over simple cells to increase invariance. To extend the traditional cortical model, we introduce the option of eccentricity-dependent pooling and tuning parameters within a given model layer. This peripheral vision system can be used to model physiological data where receptive field sizes change as a function of eccentricity. This gives the user flexibility to test different theories about filtering and pooling ranges in the periphery. In a specific instantiation of the model, pooling and tuning parameters can increase linearly with eccentricity to model physiological data found in different layers of the visual cortex. Additionally, it can be used to introduce pre-cortical model layers such as retina and LGN. We have tested the model's response with different parameters on several natural images to demonstrate its effectiveness as a research tool. The peripheral vision model presents a useful tool to test theories about crowding, attention, visual search, and other phenomena of peripheral vision.
2011年6月17日 00:00:00 GMThttps://hdl.handle.net/1721.1/646212011年06月17日T00:00:00ZScalable Information-Sharing Network Management
https://hdl.handle.net/1721.1/63260
Scalable Information-Sharing Network Management
Guo, Nina X.
This thesis analyzes scalable information-sharing network management. It looks into one of the large problems in network management today: finding information across different network domains. Information-sharing network management is a method for solving the problem, though it is important to make it scalable. The solution proposed uses the Publish-Subscribe Internet Routing Paradigm (PSIRP) inter-domain design as the base structure. The design borrows from Border Gateway Protocol ideas and uses the Chord protocol as one of the key methods of finding information. The conclusion after analyzing the scalability of PSIRP is that its use of Chord gives it an advantage that allows an O(log^2 N) tradeoff between performance and distribution.
MEng thesis
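For context on the O(log^2 N) figure, the sketch below implements the basic Chord primitive the design builds on: each node keeps a finger table of successors at power-of-two distances and routes a key to the node responsible for it in O(log N) hops. The ring size and node ids are illustrative, and stabilization, joins, and failures are omitted.

    M = 8                                         # identifiers live in [0, 2^M)
    ids = sorted([5, 20, 47, 90, 130, 200, 251])  # example node ids on the ring

    def successor(x):                             # first node at or after x (wrapping)
        x %= 2**M
        return next((n for n in ids if n >= x), ids[0])

    def finger_table(n):                          # peers at power-of-two distances
        return [successor(n + 2**i) for i in range(M)]

    def lookup(start, key):
        n, hops = start, 0
        while True:
            succ = successor(n + 1)
            if (key - n) % 2**M <= (succ - n) % 2**M:
                return succ, hops + 1             # key is owned by our successor
            # otherwise forward to the closest preceding finger
            n = max((f for f in finger_table(n)
                     if (f - n) % 2**M < (key - n) % 2**M),
                    key=lambda f: (f - n) % 2**M, default=succ)
            hops += 1

    node, hops = lookup(start=5, key=140)
    print("key 140 is stored on node %d, found in %d hops" % (node, hops))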
2011年6月07日 00:00:00 GMThttps://hdl.handle.net/1721.1/632602011年06月07日T00:00:00ZRegularization Predicts While Discovering Taxonomy
https://hdl.handle.net/1721.1/63175
Regularization Predicts While Discovering Taxonomy
Mroueh, Youssef; Poggio, Tomaso; Rosasco, Lorenzo
In this work we discuss a regularization framework to solve multi-category classification when the classes are described by an underlying class taxonomy. In particular we discuss how to learn the class taxonomy while learning a multi-category classifier.
2011年6月03日 00:00:00 GMThttps://hdl.handle.net/1721.1/631752011年06月03日T00:00:00ZUnderstanding the Performance of Broadband Networks through the Statistical Analysis of Speed Tests - Supplemental materials
https://hdl.handle.net/1721.1/62812
Understanding the Performance of Broadband Networks through the Statistical Analysis of Speed Tests - Supplemental materials
García, Rubén
Supplemental materials for the master thesis "Understanding the Performance of Broadband Networks Through the Statistical Analysis of Speed Tests", by Rubén García, submitted in May 2011 for the S.M. in Technology and Policy. Supplemental materials include: Source_code: Folder containing the source code for the statistical analysis of NDT speed tests, written for the R statistical package; NDT_data: Folder containing the following datasets (1) ndt4.h5: Initial NDT data that we used for the analysis; (2) ndt3.h5: Reduced version of the ndt4 dataset (same tests but less variables), also contains the 'whois' file that we combine with the NDT data in order to add location information; (3) comcast-ndt.h5: dataset containing the speed tests of a controlled experiment that we ran using different test durations; Aggregated_datasets: Versions of the ndt4.h5 dataset aggregated by IP and by Autonomous System.
2011年5月10日 00:00:00 GMThttps://hdl.handle.net/1721.1/628122011年05月10日T00:00:00ZjMWE v1.0.0
https://hdl.handle.net/1721.1/62793
jMWE v1.0.0
Finlayson, Mark Alan; Kulkarni, Nidhi
jMWE is a Java library for constructing and testing Multi-Word Expression detectors. The library has three main facilities: (1) a detector API, (2) a MWE index facility, and (3) a test harness. This is version 1.0.0 of the library. It contains the source code, compiled binary files, javadocs, a user's manual (pdf), and data for constructing a default MWE index. The freely available version of jMWE is licensed for use for non-commercial purposes only, as long as proper acknowledgment is made. Details can be found in the license, which is included at the end of this document. The copyright on the software is owned by MIT; if you wish to use the software for commercial purposes, please contact the MIT Technology Licensing Office for more information on how to obtain a commercial license.
"June 2011."
2011年1月01日 00:00:00 GMThttps://hdl.handle.net/1721.1/627932011年01月01日T00:00:00ZSource code and data for MWE'2011 papers
https://hdl.handle.net/1721.1/62792
Source code and data for MWE'2011 papers
Finlayson, Mark Alan; Kulkarni, Nidhi
Contains the source code and data necessary to run all computations described in the following two papers: Finlayson, Mark A. and Kulkarni, Nidhi (2011) "Detecting Multi-Word Expressions improves Word Sense Disambiguation", in Proceedings of the 2011 Workshop on Multiword Expressions, held at ACL'2011 in Portland, OR; Kulkarni, Nidhi and Finlayson, Mark A. (2011) "jMWE: A Java Toolkit for Detecting Multi-Word Expressions" in Proceedings of the 2011 Workshop on Multiword Expressions, held at ACL'2011 in Portland, OR.
2011年5月09日 00:00:00 GMThttps://hdl.handle.net/1721.1/627922011年05月09日T00:00:00ZLibrary Cache Coherence
https://hdl.handle.net/1721.1/62580
Library Cache Coherence
Shim, Keun Sup; Cho, Myong Hyon; Lis, Mieszko; Khan, Omer; Devadas, Srinivas
Directory-based cache coherence is a popular mechanism for chip multiprocessors and multicores. The directory protocol, however, requires multicast for invalidation messages and the collection of acknowledgement messages, which can be expensive in terms of latency and network traffic. Furthermore, the size of the directory increases with the number of cores. We present Library Cache Coherence (LCC), which requires neither broadcast/multicast for invalidations nor waiting for invalidation acknowledgements. A library is a set of timestamps that are used to auto-invalidate shared cache lines, and delay writes on the lines until all shared copies expire. The size of the library is independent of the number of cores. By removing the complex invalidation process of directory-based cache coherence protocols, LCC generates fewer network messages. At the same time, LCC also allows reads on a cache block to take place while a write to the block is being delayed, without breaking sequential consistency. As a result, LCC has 1.85X lower average memory latency than a MESI directory-based protocol on our set of benchmarks, even with a simple timestamp choosing algorithm; moreover, our experimental results on LCC with an ideal timestamp scheme (though not implementable) show the potential of further improvement for LCC with more sophisticated timestamp schemes.
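A single-process Python sketch of the timestamp mechanism: a read returns the line's value together with a lease after which the cached copy self-invalidates, and a write simply waits (logically) until every outstanding lease has expired, so no invalidation messages are ever sent. Real LCC operates on hardware timestamps per cache line; the lease length here is an arbitrary illustrative choice.

    class LibraryLine:
        def __init__(self, value):
            self.value = value
            self.max_lease = 0          # latest expiry over all shared copies

        def read(self, now, lease_len=10):
            """Return (value, expiry); the copy must not be used after expiry."""
            expiry = now + lease_len
            self.max_lease = max(self.max_lease, expiry)
            return self.value, expiry

        def write(self, now, value):
            """Delay the write until all leased copies have self-invalidated."""
            commit_time = max(now, self.max_lease)
            self.value = value
            return commit_time          # logical time at which the write takes effect

    line = LibraryLine(0)
    v, expiry = line.read(now=5)        # copy valid through time 15
    print("write commits at t =", line.write(now=8, value=42))   # -> 15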
2011年5月02日 00:00:00 GMThttps://hdl.handle.net/1721.1/625802011年05月02日T00:00:00ZComparison of User Traffic Characteristics on Mobile-Access versus Fixed-Access Networks
https://hdl.handle.net/1721.1/62579
Comparison of User Traffic Characteristics on Mobile-Access versus Fixed-Access Networks
Heikkinen, Mikko V. J.; Berger, Arthur W.
We compare Web traffic characteristics of mobile- versus fixed-access end-hosts, where herein the term "mobile" refers to access via cell towers, using for example the 3G/UMTS standard, and the term "fixed" includes Wi-Fi access. It is well-known that connection speeds are in general slower over mobile-access networks, and also that often there is higher packet loss. We were curious whether this leads mobile-access users to have smaller connections. We examined the distribution of the number of bytes-per-connection, and packet loss from a sampling of logs from servers of Akamai Technologies. We obtained 149 million connections, across 57 countries. The mean bytes-per-connection was typically larger for fixed-access: for two-thirds of the countries, it was at least one-third larger. Regarding distributions, we found that the difference between the bytes-per-connection for mobile- versus fixed-access, as well as the packet loss, was statistically significant for each of the countries; however the visual difference in plots is typically small. For some countries, mobile-access had the larger connections. As expected, mobile-access often had higher loss than fixed-access, but the reverse pertained for some countries. Typically packet loss increased during the busy period of the day, when mobile-access had a larger increase. Comparing our results from 2010 to those from 2009 of the same time period, we found that connections have become a bit smaller.
2011年5月03日 00:00:00 GMThttps://hdl.handle.net/1721.1/625792011年05月03日T00:00:00ZARBAC Policy for a Large Multi-National Bank
https://hdl.handle.net/1721.1/62562
ARBAC Policy for a Large Multi-National Bank
Jayaraman, Karthick; Ganesh, Vijay; Tripunitara, Mahesh; Rinard, Martin C.; Chapin, Steve J.
Administrative role-based access control (ARBAC) is the first comprehensive administrative model proposed for role-based access control (RBAC). ARBAC has several features for designing highly expressive policies, but current work has not highlighted the utility of these expressive policies. In this report, we present a case study of designing an ARBAC policy for a bank comprising 18 branches. Using this case study we provide an assessment about the features of ARBAC that are likely to be used in realistic policies.
2011年4月27日 00:00:00 GMThttps://hdl.handle.net/1721.1/625622011年04月27日T00:00:00ZCollusive Dominant-Strategy Truthfulness
https://hdl.handle.net/1721.1/62301
Collusive Dominant-Strategy Truthfulness
Chen, Jing; Micali, Silvio
Fifty years ago, Vickrey published his famous mechanism for auctioning a single good in limited supply. The main property of Vickrey's mechanism is efficiency in dominant strategies. In the absence of collusion, this is a wonderful efficiency guarantee. We note, however, that collusion is far from rare in auctions, and if some colluders exist and have some wrong beliefs, then the Vickrey mechanism dramatically loses its efficiency. Accordingly, we put forward a new mechanism that, despite unconstrained collusion, guarantees efficiency by providing a richer set of strategies and ensuring that it is dominant for every player to reveal truthfully not only his own valuation, but also with whom he is colluding, if he is indeed colluding with someone else. Our approach meaningfully bypasses prior impossibility proofs.
2011年4月22日 00:00:00 GMThttps://hdl.handle.net/1721.1/623012011年04月22日T00:00:00ZMechanism Design with Approximate Valuations
https://hdl.handle.net/1721.1/62296
Mechanism Design with Approximate Valuations
Chiesa, Alessandro; Micali, Silvio; Zhu, Zeyuan Allen
In mechanism design, we replace the strong assumption that each player knows his own payoff type EXACTLY with the more realistic assumption that he knows it only APPROXIMATELY. Specifically, we study the classical problem of maximizing social welfare in single-good auctions when players know their true valuations only within a constant multiplicative factor d in (0,1). Our approach is deliberately non-Bayesian and very conservative: each player i only knows that his true valuation is one among finitely many values in a d-APPROXIMATE SET, Ki, and his true valuation is ADVERSARIALLY and SECRETLY chosen in Ki at the beginning of the auction. We prove tight upper and lower bounds for the fraction of the maximum social welfare achievable in our model, in either dominant or undominated strategies, both via deterministic and probabilistic mechanisms. The landscape emerging is quite unusual and intriguing.
2011年2月16日 00:00:00 GMThttps://hdl.handle.net/1721.1/622962011年02月16日T00:00:00ZPartial Reversal Acyclicity
https://hdl.handle.net/1721.1/62295
Partial Reversal Acyclicity
Radeva, Tsvetomira; Lynch, Nancy
Partial Reversal (PR) is a link reversal algorithm which ensures that the underlying graph structure is destination-oriented and acyclic. These properties of PR make it useful in routing protocols and algorithms for solving leader election and mutual exclusion. While proofs exist to establish the acyclicity property of PR, they rely on assigning labels to either the nodes or the edges in the graph. In this work we present a simpler direct proof of the acyclicity property of partial reversal without using any external or dynamic labeling mechanism. First, we provide a simple variant of the PR algorithm, and show that it maintains acyclicity. Next, we present a binary relation which maps the original PR algorithm to the new algorithm, and finally, we conclude that the acyclicity proof applies to the original PR algorithm as well.
2011年4月14日 00:00:00 GMThttps://hdl.handle.net/1721.1/622952011年04月14日T00:00:00ZGasping for AIR Why we need Linked Rules and Justifications on the Semantic Web
https://hdl.handle.net/1721.1/62294
Gasping for AIR Why we need Linked Rules and Justifications on the Semantic Web
Kagal, Lalana; Jacobi, Ian; Khandelwal, Ankesh
The Semantic Web is a distributed model for publishing, utilizing and extending structured information using Web protocols. One of the main goals of this technology is to automate the retrieval and integration of data and to enable the inference of interesting results. This automation requires logics and rule languages that make inferences, choose courses of action, and answer questions. The openness of the Web, however, leads to several issues including the handling of inconsistencies, integration of diverse information, and the determination of the quality and trustworthiness of the data. AIR is a Semantic Web-based rule language that provides this functionality while focusing on generating and tracking explanations for its inferences and actions as well as conforming to Linked Data principles. AIR supports Linked Rules, which allow rules to be combined, re-used and extended in a manner similar to Linked Data. Additionally, AIR explanations themselves are Semantic Web data so they can be used for further reasoning. In this paper we present an overview of AIR, discuss its potential as a Web rule language by providing examples of how its features can be leveraged for different inference requirements, and describe how justifications are represented and generated.
2011年4月16日 00:00:00 GMThttps://hdl.handle.net/1721.1/622942011年04月16日T00:00:00ZApproximations in the HMAX Model
https://hdl.handle.net/1721.1/62293
Approximations in the HMAX Model
Chikkerur, Sharat; Poggio, Tomaso
The HMAX model is a biologically motivated architecture for computer vision whose components are in close agreement with existing physiological evidence. The model is capable of achieving close to human level performance on several rapid object recognition tasks. However, the model is computationally bound and has limited engineering applications in its current form. In this report, we present several approximations in order to increase the efficiency of the HMAX model. We outline approximations at several levels of the hierarchy and empirically evaluate the trade-offs between efficiency and accuracy. We also explore ways to quantify the representation capacity of the model.
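As a reminder of the architecture being approximated, the numpy sketch below implements one alternating simple/complex stage: an S layer correlates the image with (normalized) templates at every position, and a C layer max-pools the result over local neighborhoods. The template, pooling size, and image are illustrative, and none of the report's approximations are applied.

    import numpy as np

    def s_layer(image, templates):
        """Normalized correlation of each template at every valid position."""
        th, tw = templates.shape[1:]
        h, w = image.shape[0] - th + 1, image.shape[1] - tw + 1
        out = np.empty((len(templates), h, w))
        for k, t in enumerate(templates):
            tn = (t - t.mean()) / (t.std() + 1e-9)
            for i in range(h):
                for j in range(w):
                    out[k, i, j] = (image[i:i+th, j:j+tw] * tn).sum()
        return out

    def c_layer(s_maps, pool=4):
        """Max over local pool x pool neighborhoods, with strided subsampling."""
        k, h, w = s_maps.shape
        return np.array([[[s_maps[c, i:i+pool, j:j+pool].max()
                           for j in range(0, w - pool + 1, pool)]
                          for i in range(0, h - pool + 1, pool)]
                         for c in range(k)])

    rng = np.random.default_rng(0)
    img = rng.random((32, 32))
    bar = np.zeros((5, 5))
    bar[:, 2] = 1.0                         # a vertical-bar template
    c1 = c_layer(s_layer(img, np.stack([bar, bar.T])))
    print(c1.shape)                         # (2 orientations, pooled rows, pooled cols)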
2011年4月14日 00:00:00 GMThttps://hdl.handle.net/1721.1/622932011年04月14日T00:00:00ZEfficient Marginal Likelihood Optimization in Blind Deconvolution
https://hdl.handle.net/1721.1/62035
Efficient Marginal Likelihood Optimization in Blind Deconvolution
Levin, Anat; Weiss, Yair; Durand, Fredo; Freeman, William T.
In blind deconvolution one aims to estimate from an input blurred image y a sharp image x and an unknown blur kernel k. Recent research shows that a key to success is to consider the overall shape of the posterior distribution p(x, k|y) and not only its mode. This leads to a distinction between MAP_{x,k} strategies which estimate the mode pair x, k and often lead to undesired results, and MAP_k strategies which select the best k while marginalizing over all possible x images. The MAP_k principle is significantly more robust than the MAP_{x,k} one, yet, it involves a challenging marginalization over latent images. As a result, MAP_k techniques are considered complicated, and have not been widely exploited. This paper derives a simple approximated MAP_k algorithm which involves only a modest modification of common MAP_{x,k} algorithms. We show that MAP_k can, in fact, be optimized easily, with no additional computational complexity.
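A toy discrete example makes the distinction concrete: when one kernel explains the data through a single high-probability image while another spreads comparable mass over many images, the joint mode and the marginal mode disagree. The numbers below are fabricated purely to exhibit that gap.

    import numpy as np

    # rows: candidate kernels k0, k1; columns: candidate images x0..x3
    joint = np.array([[0.30, 0.02, 0.02, 0.02],    # p(x,k|y) for k0: one big spike
                      [0.16, 0.16, 0.16, 0.16]])   # k1: broad mass over many x
    joint /= joint.sum()

    k_map_xk = np.unravel_index(joint.argmax(), joint.shape)[0]  # MAP_{x,k}
    k_map_k = joint.sum(axis=1).argmax()                         # MAP_k
    print("MAP_{x,k} picks k%d; MAP_k picks k%d" % (k_map_xk, k_map_k))  # k0 vs k1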
2011年4月04日 00:00:00 GMThttps://hdl.handle.net/1721.1/620352011年04月04日T00:00:00ZA Comparison of Autonomic Decision Making Techniques
https://hdl.handle.net/1721.1/62020
A Comparison of Autonomic Decision Making Techniques
Maggio, Martina; Hoffmann, Henry; Santambrogio, Marco D.; Agarwal, Anant; Leva, Alberto
Autonomic computing systems are capable of adapting their behavior and resources thousands of times a second to automatically decide the best way to accomplish a given goal despite changing environmental conditions and demands. Different decision mechanisms are considered in the literature, but in the vast majority of the cases a single technique is applied to a given instance of the problem. This paper proposes a comparison of some state of the art approaches for decision making, applied to a self-optimizing autonomic system that allocates resources to a software application, which provides direct performance feedback at runtime. The Application Heartbeats framework is used to provide the sensor data (feedback), and a variety of decision mechanisms, from heuristics to control-theory and machine learning, are investigated. The results obtained with these solutions are compared by means of case studies using standard benchmarks.
2011年4月01日 00:00:00 GMThttps://hdl.handle.net/1721.1/620202011年04月01日T00:00:00ZRemote Oblivious Storage: Making Oblivious RAM Practical
https://hdl.handle.net/1721.1/62006
Remote Oblivious Storage: Making Oblivious RAM Practical
Boneh, Dan; Mazieres, David; Popa, Raluca Ada
Remote storage of data has become an increasingly attractive and advantageous option, especially due to cloud systems. While encryption protects the data, it does not hide the access pattern to the data. A natural solution is to access remote storage using an Oblivious RAM (ORAM) which provably hides all access patterns. While ORAM is asymptotically efficient, the best existing scheme (Pinkas and Reinman, Crypto'10) still has considerable overhead for a practical implementation: for M stored items, it stores 4 times and sometimes 6 times more items remotely, requires O(log^2 M) round trips to the storage server per request, and periodically blocks all data requests to shuffle all storage (which is a lengthy process). In this paper, we first define a related notion to ORAM, oblivious storage (OS), which captures more accurately and naturally the security setting of remote storage. Then, we propose a new ORAM/OS construction that solves the practicality issues just outlined: it has a storage constant of ~1, achieves O(1) round trips to the storage server per request, and allows requests to happen concurrently with shuffle without jeopardizing security. Our construction consists of a new organization of server memory into a flat main part and a hierarchical shelter, a client-side index for rapidly locating identifiers at the server, a new shelter serving requests concurrent with the shuffle, and a data structure for locating items efficiently in a partially shuffled storage.
2011年3月30日 00:00:00 GMThttps://hdl.handle.net/1721.1/620062011年03月30日T00:00:00ZMulticore Performance Optimization Using Partner Cores
https://hdl.handle.net/1721.1/61978
Multicore Performance Optimization Using Partner Cores
Lau, Eric; Miller, Jason E; Choi, Inseok; Yeung, Donald; Amarasinghe, Saman; Agarwal, Anant
As the push for parallelism continues to increase the number of cores on a chip, and add to the complexity of system design, the task of optimizing performance at the application level becomes nearly impossible for the programmer. Much effort has been spent on developing techniques for optimizing performance at runtime, but many techniques for modern processors employ speculative threads or performance counters. These approaches result in stolen cycles, or the use of an extra core, and such expensive penalties put demanding constraints on the gains provided by such methods. While processors have grown in power and complexity, the technology for small, efficient cores has emerged. We introduce the concept of Partner Cores for maximizing hardware power efficiency; these are low-area, low-power cores situated on-die, tightly coupled to each main processor core. We demonstrate that such cores enable performance improvement without incurring expensive penalties, and support applications that are impossible on a traditional chip multiprocessor.
2011年3月25日 00:00:00 GMThttps://hdl.handle.net/1721.1/619782011年03月25日T00:00:00ZSEEC: A Framework for Self-aware Management of Multicore Resources
https://hdl.handle.net/1721.1/61950
SEEC: A Framework for Self-aware Management of Multicore Resources
Hoffmann, Henry; Maggio, Martina; Santambrogio, Marco D.; Leva, Alberto; Agarwal, Anant
This paper presents SEEC, a self-aware programming model, designed to reduce programming effort in modern multicore systems. In the SEEC model, application programmers specify application goals and progress, while systems programmers separately specify actions system software and hardware can take to affect an application (e.g. resource allocation). The SEEC runtime monitors applications and dynamically selects actions to meet application goals optimally (e.g. meeting performance while minimizing power consumption). The SEEC runtime optimizes system behavior for the application rather than requiring the application programmer to optimize for the system. This paper presents a detailed discussion of the SEEC model and runtime as well as several case studies demonstrating their benefits. SEEC is shown to optimize performance per Watt for a video encoder, find optimal resource allocation for an application with complex resource usage, and maintain the goals of multiple applications in the face of environmental fluctuations.
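As a rough illustration of the runtime's observe-decide-act loop, here is a toy Python sketch. The heuristic and all names below are ours (hypothetical); the actual SEEC runtime makes its decisions with the adaptive control system described in the paper.

    # Toy goal-driven resource loop in the spirit of SEEC: the application
    # reports its progress rate, and the runtime adjusts one action (the
    # core allocation) to meet the goal while shedding unneeded resources.
    def decide(observed_rate, goal_rate, cores, max_cores):
        """One decision step; returns the new core allocation."""
        if observed_rate < 0.95 * goal_rate and cores < max_cores:
            return cores + 1        # behind the goal: add resources
        if observed_rate > 1.05 * goal_rate and cores > 1:
            return cores - 1        # ahead of the goal: save power
        return cores                # within tolerance: hold steady

    cores = 1
    for observed in [40.0, 55.0, 80.0, 110.0, 102.0]:   # fake progress rates
        cores = decide(observed, goal_rate=100.0, cores=cores, max_cores=8)
        print(f"observed={observed:6.1f} -> cores={cores}")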
2011年3月24日 00:00:00 GMThttps://hdl.handle.net/1721.1/619502011年03月24日T00:00:00ZIntel Concurrent Collections for Haskell
https://hdl.handle.net/1721.1/61759
Intel Concurrent Collections for Haskell
Newton, Ryan; Chen, Chih-Ping; Marlow, Simon
Intel Concurrent Collections (CnC) is a parallel programming model in which a network of steps (functions) communicate through message-passing as well as a limited form of shared memory. This paper describes a new implementation of CnC for Haskell. Compared to existing parallel programming models for Haskell, CnC occupies a useful point in the design space: pure and deterministic like Evaluation Strategies, but more explicit about granularity and the structure of the parallel computation, which affords the programmer greater control over parallel performance. We present results on 4, 8, and 32-core machines demonstrating parallel speedups over 20x on non-trivial benchmarks.
2011年3月22日 00:00:00 GMThttps://hdl.handle.net/1721.1/617592011年03月22日T00:00:00ZBOOM: Broadcast Optimizations for On-chip Meshes
https://hdl.handle.net/1721.1/61695
BOOM: Broadcast Optimizations for On-chip Meshes
Krishna, Tushar; Beckmann, Bradford M.; Peh, Li-Shiuan; Reinhardt, Steven K.
Future many-core chips will require an on-chip network that can support broadcasts and multicasts at good power-performance. A vanilla on-chip network would send multiple unicast packets for each broadcast packet, resulting in latency, throughput, and power overheads. Recent research in on-chip multicast support has proposed forking of broadcast/multicast packets within the network at the router buffers, but these techniques are far from ideal, since they increase buffer occupancy, which lowers throughput, and packets incur delay and power penalties at each router. In this work, we analyze an ideal broadcast mesh; show the substantial gaps between state-of-the-art multicast NoCs and the ideal; then propose BOOM, which comprises a WHIRL routing protocol that ideally load balances broadcast traffic, an mXbar multicast crossbar circuit that enables multicast traversal at similar energy-delay as unicasts, and speculative bypassing of buffering for multicast flits. Together, they enable broadcast packets to approach the delay, energy, and throughput of the ideal fabric. Our simulations show BOOM realizing an average network latency that is 5% off ideal, attaining 96% of ideal throughput, with energy consumption that is 9% above ideal. Evaluations using synthetic traffic show BOOM achieving a latency reduction of 61%, throughput improvement of 63%, and buffer power reduction of 80% as compared to a baseline broadcast. Simulations with PARSEC benchmarks show BOOM reducing average request and network latency by 40% and 15% respectively.
2011年3月14日 00:00:00 GMThttps://hdl.handle.net/1721.1/616952011年03月14日T00:00:00ZFleets: Scalable Services in a Factored Operating System
https://hdl.handle.net/1721.1/61640
Fleets: Scalable Services in a Factored Operating System
Wentzlaff, David; Gruenwald, Charles, III; Beckmann, Nathan; Belay, Adam; Kasture, Harshad; Modzelewski, Kevin; Youseff, Lamia; Miller, Jason E.; Agarwal, Anant
Current monolithic operating systems are designed for uniprocessor systems, and their architecture reflects this. The rise of multicore and cloud computing is drastically changing the tradeoffs in operating system design. The culture of scarce computational resources is being replaced with one of abundant cores, where spatial layout of processes supplants time multiplexing as the primary scheduling concern. Efforts to parallelize monolithic kernels have been difficult and only marginally successful, and new approaches are needed. This paper presents fleets, a novel way of constructing scalable OS services. With fleets, traditional OS services are factored out of the kernel and moved into user space, where they are further parallelized into a distributed set of concurrent, message-passing servers. We evaluate fleets within fos, a new factored operating system designed from the ground up with scalability as the first-order design constraint. This paper details the main design principles of fleets, and how the system architecture of fos enables their construction. We describe the design and implementation of three critical fleets (network stack, page allocation, and file system) and compare with Linux. These comparisons show that fos achieves superior performance and has better scalability than Linux for large multicores; at 32 cores, fos's page allocator performs 4.5 times better than Linux, and fos's network stack performs 2.5 times better. Additionally, we demonstrate how fleets can adapt to changing resource demand, and the importance of spatial scheduling for good performance in multicores.
2011年3月09日 00:00:00 GMThttps://hdl.handle.net/1721.1/616402011年03月09日T00:00:00ZWerner Reichardt: the man and his scientific legacy
https://hdl.handle.net/1721.1/61424
Werner Reichardt: the man and his scientific legacy
Poggio, Tomaso; Geiger, Gadi
Excerpts from a talk given by Tomaso Poggio in Tübingen at the opening of the Werner Reichardt Centrum für Integrative Neurowissenschaften, December 8, 2008.
2011年3月04日 00:00:00 GMThttps://hdl.handle.net/1721.1/614242011年03月04日T00:00:00ZDecomposing Broadcast Algorithms Using Abstract MAC Layers
https://hdl.handle.net/1721.1/61391
Decomposing Broadcast Algorithms Using Abstract MAC Layers
Khabbazian, Majid; Kowalski, Dariusz; Kuhn, Fabian; Lynch, Nancy
In much of the theoretical literature on global broadcast algorithms for wireless networks, issues of message dissemination are considered together with issues of contention management. This combination leads to complicated algorithms and analysis, and makes it difficult to extend the work to more difficult communication problems. In this paper, we present results aimed at simplifying such algorithms and analysis by decomposing the treatment into two levels, using abstract "MAC layer" specifications to encapsulate contention management. We use two different abstract MAC layers: the basic layer of Kuhn, Lynch, and Newport, and a new probabilistic layer. We first present a typical randomized contention-management algorithm for a standard graph-based radio network model and show that it implements both abstract MAC layers. Then we combine this algorithm with greedy algorithms for single-message and multi-message global broadcast and analyze the combinations, using both abstract MAC layers as intermediate layers. Using the basic MAC layer, we prove a bound of O(D log(n/ε) log Δ) for the time to deliver a single message everywhere with probability 1 − ε, where D is the network diameter, n is the number of nodes, and Δ is the maximum node degree. Using the probabilistic layer, we prove a bound of O((D + log(n/ε)) log Δ), which matches the best previously known bound for single-message broadcast over the physical network model. For multi-message broadcast, we obtain bounds of O((D + kΔ) log(n/ε) log Δ) using the basic layer and O((D + kΔ log(n/ε)) log Δ) using the probabilistic layer, for the time to deliver a message everywhere in the presence of at most k concurrent messages.
2011年2月23日 00:00:00 GMThttps://hdl.handle.net/1721.1/613912011年02月23日T00:00:00ZSoftCast: Clean-slate Scalable Wireless Video
https://hdl.handle.net/1721.1/61009
SoftCast: Clean-slate Scalable Wireless Video
Jakubczak, Szymon; Katabi, Dina
Video broadcast and mobile video challenge the conventional wireless design. In broadcast and mobile scenarios the bit rate supported by the channel differs across receivers and varies quickly over time. The conventional design, however, forces the source to pick a single bit rate and degrades sharply when the channel cannot support the chosen bit rate. This paper presents SoftCast, a clean-slate design for wireless video where the source transmits one video stream that each receiver decodes to a video quality commensurate with its specific instantaneous channel quality. To do so, SoftCast ensures the samples of the digital video signal transmitted on the channel are linearly related to the pixels' luminance. Thus, when channel noise perturbs the transmitted signal samples, the perturbation naturally translates into approximation in the original video pixels. Hence, a receiver with a good channel (low noise) obtains a high-fidelity video, and a receiver with a bad channel (high noise) obtains a low-fidelity video. We implement SoftCast using the GNURadio software and the USRP platform. Results from a 20-node testbed show that SoftCast improves the average video quality (i.e., PSNR) across broadcast receivers in our testbed by up to 5.5 dB. Even for a single receiver, it eliminates video glitches caused by mobility and increases robustness to packet loss by an order of magnitude.
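The graceful degradation follows directly from the linearity, as a minimal Python sketch shows. This is our toy example, not the system itself: one stand-in "frame" is sent as raw luminance values over an additive-noise channel, whereas the real design also applies a decorrelating transform and power allocation.

    # SoftCast's core idea in miniature: transmitted samples are linear in
    # pixel luminance, so channel noise becomes bounded pixel error and
    # quality (PSNR) degrades smoothly instead of hitting a bit-rate cliff.
    import numpy as np

    rng = np.random.default_rng(0)
    pixels = rng.integers(0, 256, size=10_000).astype(float)  # stand-in frame

    def psnr(ref, rec):
        mse = np.mean((ref - rec) ** 2)
        return 10 * np.log10(255.0 ** 2 / mse)

    for noise_std in [1.0, 5.0, 20.0]:   # three receivers, worsening channels
        received = pixels + rng.normal(0.0, noise_std, pixels.shape)
        print(f"noise_std={noise_std:5.1f}  PSNR={psnr(pixels, received):5.1f} dB")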
2011年2月15日 00:00:00 GMThttps://hdl.handle.net/1721.1/610092011年02月15日T00:00:00ZMechanism Design With Approximate Player Types
https://hdl.handle.net/1721.1/61008
Mechanism Design With Approximate Player Types
Chiesa, Alessandro; Micali, Silvio; Zhu, Zeyuan Allen
We investigate mechanism design when the players do not exactly know their types, but have instead only partial information about them.
2011年2月16日 00:00:00 GMThttps://hdl.handle.net/1721.1/610082011年02月16日T00:00:00ZTowards Understanding Hierarchical Natural Language Commands for Robotic Navigation and Manipulation
https://hdl.handle.net/1721.1/60883
Towards Understanding Hierarchical Natural Language Commands for Robotic Navigation and Manipulation
Kollar, Thomas; Dickerson, Steven; Tellex, Stefanie; Banerjee, Ashis Gopal; Walter, Matthew R.; Teller, Seth; Roy, Nicholas
We describe a new model for understanding hierarchical natural language commands for robot navigation and manipulation. The model has three components: a semantic structure that captures the hierarchical structure of language; a cost function that maps the command's semantic structure to the robot's sensorimotor capabilities; and an efficient search method for finding the lowest-cost plan. We present a proof-of-concept system that carries out navigation commands in a simulated setting.
2011年2月01日 00:00:00 GMThttps://hdl.handle.net/1721.1/608832011年02月01日T00:00:00ZWhat is Decidable about Strings?
https://hdl.handle.net/1721.1/60877
What is Decidable about Strings?
Ganesh, Vijay; Minnes, Mia; Solar-Lezama, Armando; Rinard, Martin
We prove several decidability and undecidability results for the satisfiability/validity problem of formulas over a language of finite-length strings and integers (interpreted as lengths of strings). The atomic formulas over this language are equality over string terms (word equations), linear inequalities over the length function (length constraints), and the membership predicate over regular expressions (r.e.). These decidability questions are important in logic, program analysis, and formal verification. Logicians have been attempting to resolve some of these questions for many decades, while practical satisfiability procedures for these formulas are increasingly important in the analysis of string-manipulating programs such as web applications and scripts. We prove three main theorems. First, we consider Boolean combinations of quantifier-free formulas constructed out of word equations and length constraints. We show that if word equations can be converted to a solved form, a form relevant in practice, then the satisfiability problem for Boolean combinations of word equations and length constraints is decidable. Second, we show that the satisfiability problem for word equations in solved form that are regular, length constraints, and r.e. membership predicates is also decidable. Third, we show that the validity problem for the set of sentences written as a forall-exists quantifier alternation applied to positive word equations is undecidable. A corollary of this undecidability result is that this set is undecidable even for sentences with at most two occurrences of a string variable.
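As a small worked example of the formulas in scope (ours, not one from the paper), the quantifier-free conjunction of a word equation and a length constraint

    x \cdot ab = ab \cdot x \;\wedge\; |x| \le 4

is satisfiable, with solution set x ∈ {ε, ab, abab}: a word commutes with ab exactly when it is a power of ab, and the length constraint prunes that infinite solution set to three members.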
2011年2月01日 00:00:00 GMThttps://hdl.handle.net/1721.1/608772011年02月01日T00:00:00ZCryptDB: A Practical Encrypted Relational DBMS
https://hdl.handle.net/1721.1/60876
CryptDB: A Practical Encrypted Relational DBMS
Popa, Raluca Ada; Zeldovich, Nickolai; Balakrishnan, Hari
CryptDB is a DBMS that provides provable and practical privacy in the face of a compromised database server or curious database administrators. CryptDB works by executing SQL queries over encrypted data. At its core are three novel ideas: an SQL-aware encryption strategy that maps SQL operations to encryption schemes, adjustable query-based encryption which allows CryptDB to adjust the encryption level of each data item based on user queries, and onion encryption to efficiently change data encryption levels. CryptDB only empowers the server to execute queries that the users requested, and achieves maximum privacy given the mix of queries issued by the users. The database server fully evaluates queries on encrypted data and sends the result back to the client for final decryption; client machines do not perform any query processing and client-side applications run unchanged. Our evaluation shows that CryptDB has modest overhead: on the TPC-C benchmark on Postgres, CryptDB reduces throughput by 27% compared to regular Postgres. Importantly, CryptDB does not change the innards of existing DBMSs: we realized the implementation of CryptDB using client-side query rewriting/encrypting, user-defined functions, and server-side tables for public key information. As such, CryptDB is portable; porting CryptDB to MySQL required changing 86 lines of code, mostly at the connectivity layer.
2011年1月26日 00:00:00 GMThttps://hdl.handle.net/1721.1/608762011年01月26日T00:00:00ZMulti-Output Learning via Spectral Filtering
https://hdl.handle.net/1721.1/60875
Multi-Output Learning via Spectral Filtering
Baldassarre, Luca; Rosasco, Lorenzo; Barla, Annalisa; Verri, Alessandro
In this paper we study a class of regularized kernel methods for vector-valued learning which are based on filtering the spectrum of the kernel matrix. The considered methods include Tikhonov regularization as a special case, as well as interesting alternatives such as vector-valued extensions of L2 boosting. Computational properties are discussed for various examples of kernels for vector-valued functions and the benefits of iterative techniques are illustrated. Generalizing previous results for the scalar case, we show finite sample bounds for the excess risk of the obtained estimator and, in turn, these results allow us to prove consistency for both regression and multi-category classification. Finally, we present some promising results of the proposed algorithms on artificial and real data.
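In the scalar case these spectral-filtering estimators take the familiar form (stated here for orientation, up to normalization conventions):

    \hat{f}(x) = \sum_{i=1}^{n} c_i\, K(x, x_i), \qquad \mathbf{c} = g_\lambda(\mathbf{K})\,\mathbf{y},

where \mathbf{K} is the n \times n kernel matrix, \mathbf{y} the vector of training outputs, and g_\lambda a filter function applied to the spectrum of \mathbf{K}; the choice g_\lambda(\sigma) = (\sigma + n\lambda)^{-1} recovers Tikhonov regularization, while other filters yield iterative schemes such as L2 boosting. The paper's contribution is the extension of this picture to vector-valued outputs.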
2011年1月24日 00:00:00 GMThttps://hdl.handle.net/1721.1/608752011年01月24日T00:00:00ZProbabilistic and Statistical Analysis of Perforated Patterns
https://hdl.handle.net/1721.1/60675
Probabilistic and Statistical Analysis of Perforated Patterns
Misailovic, Sasa; Roy, Daniel M.; Rinard, Martin
We present a new foundation for the analysis and transformation of computer programs. Standard approaches involve the use of logical reasoning to prove that the applied transformation does not change the observable semantics of the program. Our approach, in contrast, uses probabilistic and statistical reasoning to justify the application of transformations that may change, within probabilistic bounds, the result that the program produces. Loop perforation transforms loops to execute fewer iterations. We show how to use our basic approach to justify the application of loop perforation to a set of computational patterns. Empirical results from computations drawn from the PARSEC benchmark suite demonstrate that these computational patterns occur in practice. We also outline a specification methodology that enables the transformation of subcomputations and discuss how to automate the approach.
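To make the transformation concrete, here is a minimal Python illustration of perforating an accumulation loop (a toy example of the pattern, not code from the paper):

    # Loop perforation in miniature: execute only every stride-th iteration
    # of a reduction and estimate the result from the sampled iterations,
    # trading a small, bounded accuracy loss for a roughly stride-fold speedup.
    def mean_exact(xs):
        return sum(xs) / len(xs)

    def mean_perforated(xs, stride=4):
        kept = xs[::stride]          # perforated loop: skip 3 of 4 iterations
        return sum(kept) / len(kept)

    data = [float(i % 97) for i in range(100_000)]
    print(mean_exact(data))          # exact result
    print(mean_perforated(data))     # close, at roughly a quarter of the work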
2011年1月19日 00:00:00 GMThttps://hdl.handle.net/1721.1/606752011年01月19日T00:00:00ZFlexible Execution of Plans with Choice and Uncertainty
https://hdl.handle.net/1721.1/60674
Flexible Execution of Plans with Choice and Uncertainty
Conrad, Patrick R; Williams, Brian C
Dynamic plan execution strategies allow an autonomous agent to respond to uncertainties, while improving robustness and reducing the need for an overly conservative plan. Executives have improved robustness by expanding the types of choices made dynamically, such as selecting alternate methods. However, in some approaches to date, these additional choices often induce significant storage requirements to make flexible execution possible. This paper presents a novel system called Drake, which is able to dramatically reduce the storage requirements in exchange for increased execution time for some computations. Drake frames a plan as a collection of related Simple Temporal Problems, and executes the plan with a fast dynamic scheduling algorithm. This scheduling algorithm leverages prior work in Assumption-based Truth Maintenance Systems to compactly record and reason over the family of Simple Temporal Problems. We also allow Drake to reason over temporal uncertainty and choices by using prior work in Simple Temporal Problems with Uncertainty, which can guarantee correct execution, regardless of the uncertain outcomes. On randomly generated structured plans with choice, framed as either Temporal Plan Networks or Disjunctive Temporal Problems, we show a reduction in the size of the solution set of around four orders of magnitude, compared to prior art.
2011年1月15日 00:00:00 GMThttps://hdl.handle.net/1721.1/606742011年01月15日T00:00:00ZNeurons That Confuse Mirror-Symmetric Object Views
https://hdl.handle.net/1721.1/60379
Neurons That Confuse Mirror-Symmetric Object Views
Mutch, Jim; Leibo, Joel Z; Smale, Steve; Rosasco, Lorenzo; Poggio, Tomaso
Neurons in inferotemporal cortex that respond similarly to many pairs of mirror-symmetric images -- for example, 45 degree and -45 degree views of the same face -- have often been reported. The phenomenon seemed to be an interesting oddity. However, the same phenomenon has also emerged in simple hierarchical models of the ventral stream. Here we state a theorem characterizing sufficient conditions for this curious invariance to occur in a rather large class of hierarchical networks and demonstrate it with simulations.
2010年12月31日 00:00:00 GMThttps://hdl.handle.net/1721.1/603792010年12月31日T00:00:00ZLearning Generic Invariances in Object Recognition: Translation and Scale
https://hdl.handle.net/1721.1/60378
Learning Generic Invariances in Object Recognition: Translation and Scale
Leibo, Joel Z; Mutch, Jim; Rosasco, Lorenzo; Ullman, Shimon; Poggio, Tomaso
Invariance to various transformations is key to object recognition but existing definitions of invariance are somewhat confusing while discussions of invariance are often confused. In this report, we provide an operational definition of invariance by formally defining perceptual tasks as classification problems. The definition should be appropriate for physiology, psychophysics and computational modeling. For any specific object, invariance can be trivially "learned" by memorizing a sufficient number of example images of the transformed object. While our formal definition of invariance also covers such cases, this report focuses instead on invariance from very few images and mostly on invariances from one example. Image-plane invariances -- such as translation, rotation and scaling -- can be computed from a single image for any object. They are called generic since in principle they can be hardwired or learned (during development) for any object. In this perspective, we characterize the invariance range of a class of feedforward architectures for visual recognition that mimic the hierarchical organization of the ventral stream. We show that this class of models achieves essentially perfect translation and scaling invariance for novel images. In this architecture a new image is represented in terms of weights of "templates" (e.g. "centers" or "basis functions") at each level in the hierarchy. Such a representation inherits the invariance of each template, which is implemented through replication of the corresponding "simple" units across positions or scales and their "association" in a "complex" unit. We show simulations on real images that characterize the type and number of templates needed to support the invariant recognition of novel objects. We find that 1) the templates need not be visually similar to the target objects and that 2) a very small number of them is sufficient for good recognition. These somewhat surprising empirical results have intriguing implications for the learning of invariant recognition during the development of a biological organism, such as a human baby. In particular, we conjecture that invariance to translation and scale may be learned by the association -- through temporal contiguity -- of a small number of primal templates, that is patches extracted from the images of an object moving on the retina across positions and scales. The number of templates can later be augmented by bootstrapping mechanisms using the correspondence provided by the primal templates -- without the need of temporal contiguity.
2010年12月30日 00:00:00 GMThttps://hdl.handle.net/1721.1/603782010年12月30日T00:00:00ZConservative Rationalizability and The Second-Knowledge Mechanism
https://hdl.handle.net/1721.1/60371
Conservative Rationalizability and The Second-Knowledge Mechanism
Chen, Jing; Micali, Silvio
In mechanism design, the traditional way of modeling the players' incomplete information about their opponents is "assuming a Bayesian." This assumption, however, is very strong and does not hold in many real applications. Accordingly, we put forward (1) a set-theoretic way to model the knowledge that a player might have about his opponents, and (2) a new class of mechanisms capable of leveraging such more conservative knowledge in a robust way. In auctions of a single good, we show that such a new mechanism can perfectly guarantee a revenue benchmark (always lying in between the second highest and the highest valuation) that no classical mechanism can even approximate in any robust way.
2010年12月20日 00:00:00 GMThttps://hdl.handle.net/1721.1/603712010年12月20日T00:00:00ZConservative-Bayesian Mechanism Design
https://hdl.handle.net/1721.1/60370
Conservative-Bayesian Mechanism Design
Azar, Pablo; Chen, Jing; Micali, Silvio
Classical Bayesian mechanism design is "centralized," that is, the designer is assumed to know the distribution D from which the players' type profile has been drawn. We instead investigate a very "decentralized" Bayesian model, where the designer has no knowledge at all, and each player only has some probabilistic information about D. For this decentralized model and many contexts of interest, where the goal is to maximize revenue, we show that, for arbitrary type distributions D (in particular, correlated ones), it is possible to design mechanisms matching to a significant extent the performance of the optimal centralized mechanisms. Our results are "existential" for a broad class of contexts (including combinatorial auctions) and "constructive" for auctions of a single good.
2010年12月20日 00:00:00 GMThttps://hdl.handle.net/1721.1/603702010年12月20日T00:00:00ZHeracles: Fully Synthesizable Parameterized MIPS-Based Multicore System
https://hdl.handle.net/1721.1/60266
Heracles: Fully Synthesizable Parameterized MIPS-Based Multicore System
Kinsy, Michel; Pellauer, Michael
Heracles is an open-source complete multicore system written in Verilog. It is fully parameterized and can be reconfigured and synthesized into different topologies and sizes. Each processing node has a fully bypassed 7-stage pipelined microprocessor running the MIPS-III ISA, a 4-stage input-buffered virtual-channel router, and a local variable-size shared memory. Our design is highly modular with clear interfaces between the core, the memory hierarchy, and the on-chip network. In the baseline design, the microprocessor is attached to two caches, one instruction cache and one data cache, which are oblivious to the global memory organization. The memory system in Heracles can be configured as one single global shared memory (SM), or distributed shared memory (DSM), or any combination thereof. Each core is connected to the rest of the network of processors by a parameterized, realistic, wormhole router. We show different topology configurations of the system, and their synthesis results on the Xilinx Virtex-5 LX330T FPGA board. We also provide a small MIPS cross-compiler toolchain to assist in developing software for Heracles.
2010年12月08日 00:00:00 GMThttps://hdl.handle.net/1721.1/602662010年12月08日T00:00:00ZFrom primal templates to invariant recognition
https://hdl.handle.net/1721.1/60216
From primal templates to invariant recognition
Leibo, Joel Z; Mutch, Jim; Ullman, Shimon; Poggio, Tomaso
We can immediately recognize novel objects seen only once before -- in different positions on the retina and at different scales (distances). Is this ability hardwired by our genes or learned during development -- and if so how? We present a computational proof that developmental learning of invariance in recognition is possible and can emerge rapidly. This computational work sets the stage for experiments on the development of object invariance while suggesting a specific mechanism that may be critically tested.
2010年12月04日 00:00:00 GMThttps://hdl.handle.net/1721.1/602162010年12月04日T00:00:00ZVerification of Semantic Commutativity Conditions and Inverse Operations on Linked Data Structures
https://hdl.handle.net/1721.1/60078
Verification of Semantic Commutativity Conditions and Inverse Operations on Linked Data Structures
Kim, Deokhwan; Rinard, Martin C.
Commuting operations play a critical role in many parallel computing systems. We present a new technique for verifying commutativity conditions, which are logical formulas that characterize when operations commute. Because our technique reasons with the abstract state of verified linked data structure implementations, it can verify commuting operations that produce semantically equivalent (but not identical) data structure states in different execution orders. We have used this technique to verify sound and complete commutativity conditions for all pairs of operations on a collection of linked data structure implementations, including data structures that export a set interface (ListSet and HashSet) as well as data structures that export a map interface (AssociationList, HashTable, and ArrayList). This effort involved the specification and verification of 765 commutativity conditions. Many speculative parallel systems need to undo the effects of speculatively executed operations. Inverse operations, which undo these effects, are often more efficient than alternate approaches (such as saving and restoring data structure state). We present a new technique for verifying such inverse operations. We have specified and verified, for all of our linked data structure implementations, an inverse operation for every operation that changes the data structure state. Together, the commutativity conditions and inverse operations provide a key resource that language designers and system developers can draw on to build parallel languages and systems with strong correctness guarantees.
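As a flavor of what such a condition looks like (our illustrative example, not one of the paper's 765 verified conditions): for a set data structure, add(x) commutes with contains(y) exactly when x != y or y is already in the set, since only then does the order leave both the contains result and the final abstract state unchanged. A minimal Python check:

    # Commutativity condition for Set.add(x) vs Set.contains(y):
    # the two operations commute iff x != y or y is already present.
    def commute_add_contains(s: set, x, y) -> bool:
        return x != y or y in s

    s = {1, 2}
    print(commute_add_contains(s, 3, 3))  # False: contains(3) depends on order
    print(commute_add_contains(s, 3, 2))  # True: orders are indistinguishable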
2010年12月03日 00:00:00 GMThttps://hdl.handle.net/1721.1/600782010年12月03日T00:00:00ZLEAP Scratchpads: Automatic Memory and Cache Management for Reconfigurable Logic [Extended Version]
https://hdl.handle.net/1721.1/60045
LEAP Scratchpads: Automatic Memory and Cache Management for Reconfigurable Logic [Extended Version]
Adler, Michael; Fleming, Kermin E.; Parashar, Angshuman; Pellauer, Michael; Emer, Joel
Developers accelerating applications on FPGAs or other reconfigurable logic have nothing but raw memory devices in their standard toolkits. Each project typically includes tedious development of single-use memory management. Software developers expect a programming environment to include automatic memory management. Virtual memory provides the illusion of very large arrays and processor caches reduce access latency without explicit programmer instructions. LEAP scratchpads for reconfigurable logic dynamically allocate and manage multiple, independent, memory arrays in a large backing store. Scratchpad accesses are cached automatically in multiple levels, ranging from shared on-board, RAM-based, set-associative caches to private caches stored in FPGA RAM blocks. In the LEAP framework, scratchpads share the same interface as on-die RAM blocks and are plug-in replacements. Additional libraries support heap management within a storage set. Like software developers, accelerator authors using scratchpads may focus more on core algorithms and less on memory management. Two uses of FPGA scratchpads are analyzed: buffer management in an H.264 decoder and memory management within a processor microarchitecture timing model.
CORRECTION: The authors for entry [4] in the references should have been "E. S. Chung, J. C. Hoe, and K. Mai".
2010年11月23日 00:00:00 GMThttps://hdl.handle.net/1721.1/600452010年11月23日T00:00:00ZScalable directoryless shared memory coherence using execution migration
https://hdl.handle.net/1721.1/60039
Scalable directoryless shared memory coherence using execution migration
Lis, Mieszko; Shim, Keun Sup; Cho, Myong Hyon; Khan, Omer; Devadas, Srinivas
We introduce the concept of deadlock-free migration-based coherent shared memory to the NUCA family of architectures. Migration-based architectures move threads among cores to guarantee sequential semantics in large multicores. Using an execution migration (EM) architecture, we achieve performance comparable to directory-based architectures without using directories: avoiding automatic data replication significantly reduces cache miss rates, while a fast network-level thread migration scheme takes advantage of shared data locality to reduce remote cache accesses that limit traditional NUCA performance. EM area and energy consumption are very competitive, and, on average, it outperforms a directory-based MOESI baseline by 6.8% and a traditional S-NUCA design by 9.2%. We argue that with EM, scaling performance has much lower cost and design complexity than in directory-based coherence and traditional NUCA architectures: by merely scaling network bandwidth from 128 to 256 (512) bit flits, the performance of our architecture improves by an additional 8% (12%), while the baselines show negligible improvement.
2010年11月22日 00:00:00 GMThttps://hdl.handle.net/1721.1/600392010年11月22日T00:00:00ZOne-Shot Learning with a Hierarchical Nonparametric Bayesian Model
https://hdl.handle.net/1721.1/60025
One-Shot Learning with a Hierarchical Nonparametric Bayesian Model
Salakhutdinov, Ruslan; Tenenbaum, Josh; Torralba, Antonio
We develop a hierarchical Bayesian model that learns to learn categories from single training examples. The model transfers acquired knowledge from previously learned categories to a novel category, in the form of a prior over category means and variances. The model discovers how to group categories into meaningful super-categories that express different priors for new classes. Given a single example of a novel category, we can efficiently infer which super-category the novel category belongs to, and thereby estimate not only the new category's mean but also an appropriate similarity metric based on parameters inherited from the super-category. On MNIST and MSR Cambridge image datasets the model learns useful representations of novel categories based on just a single training example, and performs significantly better than simpler hierarchical Bayesian approaches. It can also discover new categories in a completely unsupervised fashion, given just one or a few examples.
2010年10月13日 00:00:00 GMThttps://hdl.handle.net/1721.1/600252010年10月13日T00:00:00ZGeneralization and Properties of the Neural Response
https://hdl.handle.net/1721.1/60024
Generalization and Properties of the Neural Response
Bouvrie, Jake; Poggio, Tomaso; Rosasco, Lorenzo; Smale, Steve; Wibisono, Andre
Hierarchical learning algorithms have enjoyed tremendous growth in recent years, with many new algorithms being proposed and applied to a wide range of applications. However, despite the apparent success of hierarchical algorithms in practice, the theory of hierarchical architectures remains at an early stage. In this paper we study the theoretical properties of hierarchical algorithms from a mathematical perspective. Our work is based on the framework of hierarchical architectures introduced by Smale et al. in the paper "Mathematics of the Neural Response", Foundations of Computational Mathematics, 2010. We propose a generalized definition of the neural response and derived kernel that allows us to integrate several existing hierarchical algorithms into our framework. We then use this generalized definition to analyze the theoretical properties of hierarchical architectures. Our analysis focuses on three particular aspects of the hierarchy. First, we show that a wide class of architectures suffers from range compression; essentially, the derived kernel becomes increasingly saturated at each layer. Second, we show that the complexity of a linear architecture is constrained by the complexity of the first layer, and in some cases the architecture collapses into a single-layer linear computation. Finally, we characterize the discrimination and invariance properties of the derived kernel in the case when the input data are one-dimensional strings. We believe that these theoretical results will provide a useful foundation for guiding future developments within the theory of hierarchical algorithms.
2010年11月19日 00:00:00 GMThttps://hdl.handle.net/1721.1/600242010年11月19日T00:00:00ZA Tree-Based Context Model for Object Recognition
https://hdl.handle.net/1721.1/59799
A Tree-Based Context Model for Object Recognition
Choi, Myung Jin; Lim, Joseph J.; Torralba, Antonio; Willsky, Alan S.
There has been a growing interest in exploiting contextual information in addition to local features to detect and localize multiple object categories in an image. A context model can rule out some unlikely combinations or locations of objects and guide detectors to produce a semantically coherent interpretation of a scene. However, the performance benefit of context models has been limited because most of the previous methods were tested on datasets with only a few object categories, in which most images contain one or two object categories. In this paper, we introduce a new dataset with images that contain many instances of different object categories, and propose an efficient model that captures the contextual information among more than a hundred object categories using a tree structure. Our model incorporates global image features, dependencies between object categories, and outputs of local detectors into one probabilistic framework. We demonstrate that our context model improves object recognition performance and provides a coherent interpretation of a scene, which enables a reliable image querying system by multiple object categories. In addition, our model can be applied to scene understanding tasks that local detectors alone cannot solve, such as detecting objects out of context or querying for the most typical and the least typical scenes in a dataset.
2010年10月29日 00:00:00 GMThttps://hdl.handle.net/1721.1/597992010年10月29日T00:00:00ZSEEC: A Framework for Self-aware Computing
https://hdl.handle.net/1721.1/59519
SEEC: A Framework for Self-aware Computing
Hoffmann, Henry; Maggio, Martina; Santambrogio, Marco D.; Leva, Alberto; Agarwal, Anant
As the complexity of computing systems increases, application programmers must be experts in their application domain and have the systems knowledge required to address the problems that arise from parallelism, power, energy, and reliability concerns. One approach to relieving this burden is to make use of self-aware computing systems, which automatically adjust their behavior to help applications achieve their goals. This paper presents the SEEC framework, a unified computational model designed to enable self-aware computing in both applications and system software. In the SEEC model, applications specify goals, system software specifies possible actions, and the SEEC framework is responsible for deciding how to use the available actions to meet the application-specified goals. The SEEC framework is built around a general and extensible control system which provides predictable behavior and allows SEEC to make decisions that achieve goals while optimizing resource utilization. To demonstrate the applicability of the SEEC framework, this paper presents five different self-aware systems built using SEEC. Case studies demonstrate how these systems can control the performance of the PARSEC benchmarks, optimize performance per Watt for a video encoder, and respond to unexpected changes in the underlying environment. In general these studies demonstrate that systems built using the SEEC framework are goal-oriented, predictable, adaptive, and extensible.
2010年10月13日 00:00:00 GMThttps://hdl.handle.net/1721.1/595192010年10月13日T00:00:00ZAudit Trails in the Aeolus Distributed Security Platform
https://hdl.handle.net/1721.1/58772
Audit Trails in the Aeolus Distributed Security Platform
Popic, Victoria
This thesis provides a complete design and implementation of audit trail collection and storage for Aeolus, a distributed security platform based on information flow control. An information flow control system regulates all activities that concern information security. By recording all the operations monitored by Aeolus, our audit trails capture all actions that can affect system security. In our system, event records are collected on each system node and shipped to a centralized location, where they are stored and processed. To correlate audit trail events of different system nodes we store event dependencies directly in the event records. Each audit trail record keeps links to its immediate predecessors. Therefore, our audit trails form dependency graphs that capture the causal relationship among system events. These graphs can be used to reconstruct the chains of events leading to a given system state. Our results show that audit trail collection imposes a small overhead on system performance.
MEng thesis
2010年9月29日 00:00:00 GMThttps://hdl.handle.net/1721.1/587722010年09月29日T00:00:00ZBayesian perceptual inference in linear Gaussian models
https://hdl.handle.net/1721.1/58669
Bayesian perceptual inference in linear Gaussian models
Battaglia, Peter W.
The aim of this paper is to provide perceptual scientists with a quantitative framework for modeling a variety of common perceptual behaviors, and to unify various perceptual inference tasks by exposing their common computational underpinnings. This paper derives a model Bayesian observer for perceptual contexts with linear Gaussian generative processes. I demonstrate the relationship between four fundamental perceptual situations by expressing their corresponding posterior distributions as consequences of the model's predictions under their respective assumptions.
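The workhorse identity behind such models is the textbook Gaussian posterior update, stated here for reference in the paper's linear Gaussian setting:

    y = Ax + \epsilon, \qquad \epsilon \sim \mathcal{N}(0, \Sigma_y), \qquad x \sim \mathcal{N}(\mu_x, \Sigma_x),

    p(x \mid y) = \mathcal{N}\big(x;\ \mu_{x \mid y},\ \Sigma_{x \mid y}\big), \qquad
    \Sigma_{x \mid y} = \big(\Sigma_x^{-1} + A^{\top}\Sigma_y^{-1}A\big)^{-1}, \qquad
    \mu_{x \mid y} = \Sigma_{x \mid y}\big(\Sigma_x^{-1}\mu_x + A^{\top}\Sigma_y^{-1}y\big).

Roughly, each of the four perceptual situations then corresponds to a particular choice of the generative map A and the covariances.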
2010年9月21日 00:00:00 GMThttps://hdl.handle.net/1721.1/586692010年09月21日T00:00:00ZA File Location, Replication, and Distribution System for Network Information to Aid Network Management
https://hdl.handle.net/1721.1/58668
A File Location, Replication, and Distribution System for Network Information to Aid Network Management
Cheng, Tiffany
This thesis demonstrates and evaluates the design, architecture, and implementation of a file location, replication, and distribution system built with the objective of managing information in an Internet network. The system's goal is to ensure the availability of information by providing alternative locations for files when the original piece of information cannot be found in the network due to failures or other problems. The system provides the mechanism for duplicating files and places them in multiple locations according to predefined rules for distribution. The resulting system is a working model for a file management system that can exist over the Internet and will aid in overall network management by organizing and overseeing the information found within a network.
MEng thesis
2010年9月22日 00:00:00 GMThttps://hdl.handle.net/1721.1/586682010年09月22日T00:00:00ZLearning Solutions of Similar Linear Programming Problems using Boosting Trees
https://hdl.handle.net/1721.1/58609
Learning Solutions of Similar Linear Programming Problems using Boosting Trees
Banerjee, Ashis Gopal; Roy, Nicholas
In many optimization problems, similar linear programming (LP) problems occur in the nodes of the branch and bound trees that are used to solve integer (mixed or pure, deterministic or stochastic) programming problems. Similar LP problems are also found in problem domains where the objective function and constraint coefficients vary due to uncertainties in the operating conditions. In this report, we present a regression technique for learning a set of functions that map the objective function and the constraints to the decision variables of such an LP system by modifying boosting trees, an algorithm we term the Boost-LP algorithm. Matrix transformations and geometric properties of boosting trees are utilized to provide theoretical performance guarantees on the predicted values. The standard form of the loss function is altered to reduce the possibility of generating infeasible LP solutions. Experimental results on three different problems, one each on scheduling, routing, and planning respectively, demonstrate the effectiveness of the Boost-LP algorithm in providing significant computational benefits over regular optimization solvers without generating solutions that deviate appreciably from the optimum values.
2010年9月18日 00:00:00 GMThttps://hdl.handle.net/1721.1/586092010年09月18日T00:00:00ZConservative-Bayesian Mechanisms
https://hdl.handle.net/1721.1/58486
Conservative-Bayesian Mechanisms
Azar, Pablo; Chen, Jing; Micali, Silvio
We put forward a new class of mechanisms. In this extended abstract, we exemplify our approach only for single-good auctions in what we call a conservative-Bayesian setting. (Essentially, no common knowledge about the underlying distribution of the players' valuations is required.) We prove that our mechanism is optimal in this challenging and realistic setting.
2010年9月08日 00:00:00 GMThttps://hdl.handle.net/1721.1/584862010年09月08日T00:00:00ZPractical Color-Based Motion Capture
https://hdl.handle.net/1721.1/58485
Practical Color-Based Motion Capture
Wang, Robert; Paris, Sylvain; Popovic, Jovan
Motion capture systems have been widely used for high quality content creation and virtual reality but are rarely used in consumer applications due to their price and setup cost. In this paper, we propose a motion capture system built from commodity components that can be deployed in a matter of minutes. Our approach uses one or more webcams and a color shirt to track the upper-body at interactive rates. We describe a robust color calibration system that enables our color-based tracking to work against cluttered backgrounds and under multiple illuminants. We demonstrate our system in several real-world indoor and outdoor settings.
2010年9月10日 00:00:00 GMThttps://hdl.handle.net/1721.1/584852010年09月10日T00:00:00ZReliably Detecting Connectivity using Local Graph Traits
https://hdl.handle.net/1721.1/58484
Reliably Detecting Connectivity using Local Graph Traits
Cornejo, Alejandro; Lynch, Nancy
Local distributed algorithms can only gather sufficient information to identify local graph traits, that is, properties that hold within the local neighborhood of each node. However, it is frequently the case that global graph properties (connectivity, diameter, girth, etc) have a large influence on the execution of a distributed algorithm. This paper studies local graph traits and their relationship with global graph properties. Specifically, we focus on graph k-connectivity. First we prove a negative result that shows there does not exist a local graph trait which perfectly captures graph k-connectivity. We then present three different local graph traits which can be used to reliably predict the k-connectivity of a graph with varying degrees of accuracy. As a simple application of these results, we present upper and lower bounds for a local distributed algorithm which determines if a graph is k-connected. As a more elaborate application of local graph traits, we describe, and prove the correctness of, a local distributed algorithm that preserves k-connectivity in mobile ad hoc networks while allowing nodes to move independently whenever possible.
2010年9月09日 00:00:00 GMThttps://hdl.handle.net/1721.1/584842010年09月09日T00:00:00ZAn Overview of MOOS-IvP and a Users Guide to the IvP Helm Autonomy Software
https://hdl.handle.net/1721.1/57583
An Overview of MOOS-IvP and a Users Guide to the IvP Helm Autonomy Software
Benjamin, Michael R.; Newman, Paul; Schmidt, Henrik; Leonard, John J.
This document describes the IvP Helm -- an Open Source behavior-based autonomy application for unmanned vehicles. IvP is short for interval programming -- a technique for representing and solving multi-objective optimization problems. Behaviors in the IvP Helm are reconciled using multi-objective optimization when in competition with each other for influence over the vehicle. The IvP Helm is written as a MOOS application, where MOOS is a set of Open Source publish-subscribe autonomy middleware tools. This document describes the configuration and use of the IvP Helm, provides examples of simple missions, and gives information on how to download and build the software from the MOOS-IvP server at www.moosivp.org.
2010年8月27日 00:00:00 GMThttps://hdl.handle.net/1721.1/575832010年08月27日T00:00:00ZThe Abstract MAC Layer
https://hdl.handle.net/1721.1/57577
The Abstract MAC Layer
Kuhn, Fabian; Lynch, Nancy; Newport, Calvin
A diversity of possible communication assumptions complicates the study of algorithms and lower bounds for radio networks. We address this problem by defining an abstract MAC layer. This service provides reliable local broadcast communication, with timing guarantees stated in terms of a collection of abstract delay functions applied to the relevant contention. Algorithm designers can analyze their algorithms in terms of these functions, independently of specific channel behavior. Concrete implementations of the abstract MAC layer over basic radio network models generate concrete definitions for these delay functions, automatically adapting bounds proven for the abstract service to bounds for the specific radio network under consideration. To illustrate this approach, we use the abstract MAC layer to study the new problem of Multi-Message Broadcast, a generalization of standard single-message broadcast in which multiple messages can originate at different times and locations in the network. We present and analyze two algorithms for Multi-Message Broadcast in static networks: a simple greedy algorithm and one that uses regional leaders. We then indicate how these results can be extended to mobile networks.
2010年8月26日 00:00:00 GMThttps://hdl.handle.net/1721.1/575772010年08月26日T00:00:00ZMOOS-IvP Autonomy Tools Users Manual
https://hdl.handle.net/1721.1/57509
MOOS-IvP Autonomy Tools Users Manual
Benjamin, Michael R.
This document describes fifteen MOOS-IvP autonomy tools. uHelmScope provides a run-time scoping window into the state of an active IvP Helm executing its mission. pMarineViewer is a geo-based GUI tool for rendering marine vehicles and geometric data in their operational area. uXMS is a terminal-based tool for scoping on a MOOSDB process. uTermCommand is a terminal-based tool for poking a MOOSDB with a set of variable-value pairs pre-defined in a MOOS file and selectable with aliases from the command line. pEchoVar provides a way of echoing a post to one MOOS variable with a new post having the same value to a different variable. uProcessWatch monitors the presence or absence of a set of MOOS processes and summarizes the collective status in a single MOOS variable. uPokeDB provides a way of poking the MOOSDB from the command line with one or more variable-value pairs without any pre-existing configuration of a MOOS file. uTimerScript will execute a pre-defined, pausable, timed script of poking variable-value pairs to a MOOSDB. pNodeReporter summarizes a platform's critical information into a single node report string for sharing beyond the vehicle. pBasicContactMgr provides a basic contact management service with the ability to generate range-dependent configurable alerts. The Alog Toolbox is a set of offline tools for analyzing and manipulating log files in the .alog format.
2010年8月23日 00:00:00 GMThttps://hdl.handle.net/1721.1/575092010年08月23日T00:00:00ZUCM/MIT Indications, Referring Expressions, and Coreference Corpus (UMIREC corpus) v1.1
https://hdl.handle.net/1721.1/57507
UCM/MIT Indications, Referring Expressions, and Coreference Corpus (UMIREC corpus) v1.1
Finlayson, Mark Alan; Hervas, Raquel
The corpus comprises 62 files in "Story Workbench" annotation format: 30 folktales in English from a variety of sources, and 32 Wall Street Journal articles selected to coincide with articles found in the Penn Treebank. The files are annotated with the location of referring expressions, coreference relations between the referring expressions, and so-called "indication structures", which split referring expressions into constituents (nuclei and modifiers) and mark each constituent as either 'distinctive' or 'descriptive', indicating whether or not the constituent contains information required for uniquely identifying the referent. The files distributed in this corpus archive are the gold-standard files, which were constructed by merging annotations done by two trained annotators. The contents of this corpus, the annotation procedure, and the indication structures are described in more detail in a paper titled "The Prevalence of Descriptive Referring Expressions in News and Narrative" published in the proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, held in July 2010 in Uppsala, Sweden (ACL-2010). A near-final version of the paper is included in the doc/ directory of the compressed corpus archive file.
This is version 1.1 of the UMIREC corpus, in which the coreference annotations have been fixed relative to version 1.0. UMIREC v1.0 suffered from a bug in the export script that corrupted the coreference data.
2010年5月12日 00:00:00 GMThttps://hdl.handle.net/1721.1/575072010年05月12日T00:00:00ZParallelizing Sequential Programs With Statistical Accuracy Tests
https://hdl.handle.net/1721.1/57475
Parallelizing Sequential Programs With Statistical Accuracy Tests
Misailovic, Sasa; Kim, Deokhwan; Rinard, Martin
We present QuickStep, a novel system for parallelizing sequential programs. QuickStep deploys a set of parallelization transformations that together induce a search space of candidate parallel programs. Given a sequential program, representative inputs, and an accuracy requirement, QuickStep uses performance measurements, profiling information, and statistical accuracy tests on the outputs of candidate parallel programs to guide its search for a parallelization that maximizes performance while preserving acceptable accuracy. When the search completes, QuickStep produces an interactive report that summarizes the applied parallelization transformations, performance, and accuracy results for the automatically generated candidate parallel programs. In our envisioned usage scenarios, the developer examines this report to evaluate the acceptability of the final parallelization and to obtain insight into how the original sequential program responds to different parallelization strategies. It is also possible for the developer (or even a user of the program who has no software development expertise whatsoever) to simply use the best parallelization out of the box without examining the report or further investigating the parallelization. Results from our benchmark set of applications show that QuickStep can automatically generate accurate and efficient parallel programs---the automatically generated parallel versions of five of our six benchmark applications run between 5.0 and 7.7 times faster on 8 cores than the original sequential versions. Moreover, a comparison with the Intel icc compiler highlights how QuickStep can effectively parallelize applications with features (such as the use of modern object-oriented programming constructs or desirable parallelizations with infrequent but acceptable data races) that place them inherently beyond the reach of standard approaches.
2010年8月05日 00:00:00 GMThttps://hdl.handle.net/1721.1/574752010年08月05日T00:00:00ZAn Efficient Learning Procedure for Deep Boltzmann Machines
https://hdl.handle.net/1721.1/57474
An Efficient Learning Procedure for Deep Boltzmann Machines
Salakhutdinov, Ruslan; Hinton, Geoffrey
We present a new learning algorithm for Boltzmann Machines that contain many layers of hidden variables. Data-dependent statistics are estimated using a variational approximation that tends to focus on a single mode, and data-independent statistics are estimated using persistent Markov chains. The use of two quite different techniques for estimating the two types of statistic that enter into the gradient of the log likelihood makes it practical to learn Boltzmann Machines with multiple hidden layers and millions of parameters. The learning can be made more efficient by using a layer-by-layer "pre-training" phase that initializes the weights sensibly. The pre-training also allows the variational inference to be initialized sensibly with a single bottom-up pass. We present results on the MNIST and NORB datasets showing that Deep Boltzmann Machines learn very good generative models of hand-written digits and 3-D objects. We also show that the features discovered by Deep Boltzmann Machines are a very effective way to initialize the hidden layers of feed-forward neural nets which are then discriminatively fine-tuned.
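For reference, the energy function of a two-hidden-layer Deep Boltzmann Machine over binary visible units v and hidden units h^1, h^2 has the form (bias terms omitted for brevity):

    E(\mathbf{v}, \mathbf{h}^1, \mathbf{h}^2; \theta) = -\mathbf{v}^{\top} W^{1} \mathbf{h}^{1} - (\mathbf{h}^{1})^{\top} W^{2} \mathbf{h}^{2}, \qquad
    p(\mathbf{v}; \theta) \propto \sum_{\mathbf{h}^1, \mathbf{h}^2} e^{-E(\mathbf{v}, \mathbf{h}^1, \mathbf{h}^2; \theta)}.

The gradient of the log likelihood is a difference between data-dependent and data-independent expectations of the energy gradient, which is exactly where the variational approximation and the persistent Markov chains described above enter.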
2010年8月04日 00:00:00 GMThttps://hdl.handle.net/1721.1/574742010年08月04日T00:00:00ZMAC Design for Analog Network Coding
https://hdl.handle.net/1721.1/57473
MAC Design for Analog Network Coding
Khabbazian, Majid; Kuhn, Fabian; Lynch, Nancy; Medard, Muriel; ParandehGheibi, Ali
Most medium access control mechanisms discard collided packets and consider interference harmful. Recent work on Analog Network Coding (ANC) suggests a different approach, in which multiple interfering transmissions are strategically scheduled. The received collisions are collected and then used in a decoding process, such as the ZigZag decoding process, where the packets involved in the collisions are extracted. In this paper, we present an algebraic representation of collisions and describe a general approach to recovering collisions using ANC. To study the effect of using ANC on the performance of MAC layers, we develop an ANC-based algorithm that implements an abstract MAC layer service, as defined in [1, 2], and analyze its performance. This study proves that ANC can significantly improve the performance of MAC layer services, in terms of probabilistic time guarantees for packet delivery. We illustrate how this improvement at the MAC layer can translate into faster higher-level algorithms, by analyzing the time complexity of a multiple-message network-wide broadcast algorithm that uses our ANC-based MAC service.
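The algebraic view can be illustrated directly: treat each received collision as a known linear combination of the transmitted packets and solve the resulting system. Offset and channel estimation, which real ANC/ZigZag decoding must handle, are assumed away in this toy example.

    import numpy as np

    # Two unknown packets, two received collisions with (assumed known)
    # mixing coefficients; recovery is a linear solve.
    packets = np.array([[1.0, -1.0, 1.0, 1.0],    # packet A (unknown)
                        [1.0, 1.0, -1.0, -1.0]])  # packet B (unknown)

    mix = np.array([[1.0, 1.0],    # collision 1: A + B
                    [1.0, -0.5]])  # collision 2: A - 0.5 B (new channel)

    collisions = mix @ packets                 # what the receiver hears
    recovered = np.linalg.solve(mix, collisions)
    assert np.allclose(recovered, packets)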
2010年8月02日 00:00:00 GMThttps://hdl.handle.net/1721.1/574732010年08月02日T00:00:00ZLearning and Invariance in a Family of Hierarchical Kernels
https://hdl.handle.net/1721.1/57464
Learning and Invariance in a Family of Hierarchical Kernels
Wibisono, Andre; Bouvrie, Jake; Rosasco, Lorenzo; Poggio, Tomaso
Understanding invariance and discrimination properties of hierarchical models is arguably the key to understanding how and why such models, of which the mammalian visual system is one instance, can lead to good generalization properties and reduce the sample complexity of a given learning task. In this paper we explore invariance to transformation and the role of layer-wise embeddings within an abstract framework of hierarchical kernels motivated by the visual cortex. Here a novel form of invariance is induced by propagating the effect of locally defined, invariant kernels throughout a hierarchy. We study this notion of invariance empirically. We then present an extension of the abstract hierarchical modeling framework to incorporate layer-wise embeddings, which we demonstrate can lead to improved generalization and scalable algorithms. Finally, we experimentally analyze sample complexity properties as a function of architectural parameters.
2010年7月30日 00:00:00 GMThttps://hdl.handle.net/1721.1/574642010年07月30日T00:00:00ZExamining high level neural representations of cluttered scenes
https://hdl.handle.net/1721.1/57463
Examining high level neural representations of cluttered scenes
Meyers, Ethan; Embark, Hamdy; Freiwald, Winrich; Serre, Thomas; Kreiman, Gabriel; Poggio, Tomaso
Humans and other primates can rapidly categorize objects even when they are embedded in complex visual scenes (Thorpe et al., 1996; Fabre-Thorpe et al., 1998). Studies by Serre et al., 2007 have shown that the ability of humans to detect animals in brief presentations of natural images decreases as the size of the target animal decreases and the amount of clutter increases, and additionally, that a feedforward computational model of the ventral visual system, originally developed to account for physiological properties of neurons, shows a similar pattern of performance. Motivated by these studies, we recorded single- and multi-unit neural spiking activity from macaque superior temporal sulcus (STS) and anterior inferior temporal cortex (AIT), as a monkey passively viewed images of natural scenes. The stimuli consisted of 600 images of animals in natural scenes, and 600 images of natural scenes without animals in them, captured at four different viewing distances, and were the same images used by Serre et al. to allow for a direct comparison between human psychophysics, computational models, and neural data. To analyze the data, we applied population "readout" techniques (Hung et al., 2005; Meyers et al., 2008) to decode from the neural activity whether an image contained an animal or not. The decoding results showed a similar pattern of degraded decoding performance with increasing clutter as was seen in the human psychophysics and computational model results. However, overall the decoding accuracies from the neural data were lower than those seen in the computational model, and the latencies of information in IT were long (~125ms) relative to behavioral measures obtained from primates in other studies. Additional tests also showed that the responses of the model units were not capturing several properties of the neural responses, and that detecting animals in cluttered scenes using simple model units based on V1 cells worked almost as well as using more complex model units that were designed to model the responses of IT neurons. While these results suggest AIT might not be the primary brain region involved in this form of rapid categorization, additional studies are needed before drawing strong conclusions.
2010年7月29日 00:00:00 GMThttps://hdl.handle.net/1721.1/574632010年07月29日T00:00:00ZCharacteristics of Small Social Networks
https://hdl.handle.net/1721.1/57462
Characteristics of Small Social Networks
Richards, Whitman; Macindoe, Owen
Two dozen networks are analyzed using three parameters that attempt to capture important properties of social networks: leadership L, member bonding B, and diversity of expertise D. The first two of these parameters have antecedents; the third is new. A key part of the analysis is to examine networks at multiple scales by dissecting the entire network into its n subgraphs of a given radius of two edge steps about each of the n nodes. This scale-based analysis reveals constraints on what we have dubbed "cognitive" networks, as contrasted with biological or physical networks. Specifically, "cognitive" networks appear to maximize bonding and diversity over a range of leadership dominance. Asymptotic relations between the bonding and diversity measures are also found when small, nearly complete subgraphs are aggregated to form larger networks. This aggregation probably underlies changes in a regularity among the LBD parameters; this regularity is a U-shaped function of network size, n, which is minimal for networks of around 80 or so nodes.
2010年7月27日 00:00:00 GMThttps://hdl.handle.net/1721.1/574622010年07月27日T00:00:00ZLanguage and Compiler Support for Auto-Tuning Variable-Accuracy Algorithms
https://hdl.handle.net/1721.1/57461
Language and Compiler Support for Auto-Tuning Variable-Accuracy Algorithms
Ansel, Jason; Wong, Yee Lok; Chan, Cy; Olszewski, Marek; Edelman, Alan; Amarasinghe, Saman
Approximating ideal program outputs is a common technique for solving computationally difficult problems, for adhering to processing or timing constraints, and for performance optimization in situations where perfect precision is not necessary. To this end, programmers often use approximation algorithms, iterative methods, data resampling, and other heuristics. However, programming such variable accuracy algorithms presents difficult challenges since the optimal algorithms and parameters may change with different accuracy requirements and usage environments. This problem is further compounded when multiple variable accuracy algorithms are nested together due to the complex way that accuracy requirements can propagate across algorithms and because of the resulting size of the set of allowable compositions. As a result, programmers often deal with this issue in an ad-hoc manner that can sometimes violate sound programming practices such as maintaining library abstractions. In this paper, we propose language extensions that expose trade-offs between time and accuracy to the compiler. The compiler performs fully automatic compile-time and install-time autotuning and analyses in order to construct optimized algorithms to achieve any given target accuracy. We present novel compiler techniques and a structured genetic tuning algorithm to search the space of candidate algorithms and accuracies in the presence of recursion and sub-calls to other variable accuracy code. These techniques benefit both the library writer, by providing an easy way to describe and search the parameter and algorithmic choice space, and the library user, by allowing high level specification of accuracy requirements which are then met automatically without the need for the user to understand any algorithm-specific parameters. Additionally, we present a new suite of benchmarks, written in our language, to examine the efficacy of our techniques. Our experimental results show that by relaxing accuracy requirements, we can easily obtain performance improvements ranging from 1.1x up to orders of magnitude.
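A hypothetical miniature of the variable-accuracy idea: an offline tuner benchmarks each candidate choice on training inputs, records the cheapest choice meeting each accuracy target, and the runtime dispatches on the requested accuracy. The real system does this at compile/install time with a structured genetic search over nested choices; every name below is illustrative.

    def tune(choices, evaluate, targets):
        # choices: candidate implementations; evaluate(f) -> (accuracy,
        # time) on training inputs. Returns a dispatch table mapping
        # each accuracy target to the fastest choice that meets it.
        measured = [(f, *evaluate(f)) for f in choices]
        table = {}
        for target in targets:
            ok = [(time, f) for f, acc, time in measured if acc >= target]
            if ok:
                table[target] = min(ok, key=lambda pair: pair[0])[1]
        return table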
2010年7月27日 00:00:00 GMThttps://hdl.handle.net/1721.1/574612010年07月27日T00:00:00ZChitChat: Making Video Chat Robust to Packet Loss
https://hdl.handle.net/1721.1/56252
ChitChat: Making Video Chat Robust to Packet Loss
Wang, Jue; Katabi, Dina
Video chat is increasingly popular among Internet users. Often, however, chatting sessions suffer from packet loss, which causes video outage and poor quality. Existing solutions are unsatisfying. Retransmissions increase the delay and hence can interact negatively with the strict timing requirements of interactive video. FEC codes introduce extra overhead and hence reduce the bandwidth available for video data even in the absence of packet loss. This paper presents ChitChat, a new approach for reliable video chat that neither delays frames nor introduces bandwidth overhead. The key idea is to ensure that the information in each packet describes the whole frame. As a result, even when some packets are lost, the receiver can still use the received packets to decode a smooth version of the original frame. This reduces frame loss and the resulting video freezes, and improves the perceived video quality. We have implemented ChitChat and evaluated it over multiple Internet paths. In comparison to Windows Live Messenger 2009, our method reduces the occurrences of video outage events by more than an order of magnitude.
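A toy rendering of the core idea, with pixel interleaving standing in for ChitChat's actual transform-domain packetization: every packet samples the whole frame coarsely, so a lost packet degrades the frame uniformly instead of blanking a region.

    import numpy as np

    def packetize(frame, n_packets):
        flat = frame.ravel()
        return [flat[i::n_packets] for i in range(n_packets)]

    def reconstruct(shape, pieces, n_packets):
        flat = np.zeros(np.prod(shape))
        have = np.zeros(np.prod(shape), dtype=bool)
        for i, piece in enumerate(pieces):
            if piece is None:              # this packet was lost
                continue
            flat[i::n_packets] = piece
            have[i::n_packets] = True
        # Crude interpolation: fill lost samples from the received mean.
        flat[~have] = flat[have].mean()
        return flat.reshape(shape)

    frame = np.arange(64, dtype=float).reshape(8, 8)
    pieces = packetize(frame, 4)
    pieces[2] = None                       # simulate one lost packet
    approx = reconstruct(frame.shape, pieces, 4)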
2010年7月05日 00:00:00 GMThttps://hdl.handle.net/1721.1/562522010年07月05日T00:00:00ZEM2: A Scalable Shared-Memory Multicore Architecture
https://hdl.handle.net/1721.1/55944
EM2: A Scalable Shared-Memory Multicore Architecture
Khan, Omer; Lis, Mieszko; Devadas, Srini
We introduce the Execution Migration Machine (EM2), a novel, scalable shared-memory architecture for large-scale multicores constrained by off-chip memory bandwidth. EM2 reduces cache miss rates, and consequently off-chip memory usage, by permitting only one copy of data to be stored anywhere in the system: when a thread wishes to access an address not locally cached on the core it is executing on, it migrates to the appropriate core and continues execution. Using detailed simulations of a range of 256-core configurations on the SPLASH-2 benchmark suite, we show that EM2 improves application completion times by 18% on average while remaining competitive with traditional architectures in silicon area.
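The access rule is simple to sketch: every address has one home core, and a thread touching a non-local address migrates there rather than fetching a copy, so no coherence protocol is needed. The granularity and helper structure below are illustrative assumptions, not the paper's implementation.

    N_CORES = 4
    caches = [dict() for _ in range(N_CORES)]   # per-core private caches

    class Thread:
        def __init__(self, core):
            self.core = core
            self.migrations = 0

    def home_core(addr):
        return (addr >> 6) % N_CORES            # one home per 64-byte line

    def access(thread, memory, addr):
        target = home_core(addr)
        if thread.core != target:               # move the computation,
            thread.core = target                # not a copy of the data
            thread.migrations += 1
        cache = caches[target]
        if addr not in cache:
            cache[addr] = memory[addr]          # only the home core ever
        return cache[addr]                      # caches the line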
2010年6月12日 00:00:00 GMThttps://hdl.handle.net/1721.1/559442010年06月12日T00:00:00ZBroadcasting in Unreliable Radio Networks
https://hdl.handle.net/1721.1/55721
Broadcasting in Unreliable Radio Networks
Oshman, Rotem; Richa, Andrea; Newport, Calvin; Lynch, Nancy; Kuhn, Fabian
Practitioners agree that unreliable links, which fluctuate between working and not working, are an important characteristic of wireless networks. In contrast, most theoretical models of radio networks fix a static set of links and assume that these links work reliably throughout an execution. This gap between theory and practice motivates us to investigate how unreliable links affect theoretical bounds on broadcast in radio networks. To that end we consider a model that includes two types of links: reliable links, which always deliver messages, and unreliable links, which sometimes deliver messages and sometimes do not. It is assumed that the graph induced by the reliable links is connected, and unreliable links are controlled by a worst-case adversary. In the new model we show an Ω(n log n) lower bound on deterministic broadcast in undirected graphs, even when all processes are initially awake and have collision detection, and an Ω(n) lower bound on randomized broadcast in undirected networks of constant diameter. This clearly separates the new model from the classical, reliable model. On the positive side, we give two algorithms that tolerate the inherent unreliability: an O(n^(3/2) √(log n))-time deterministic algorithm and a randomized algorithm which terminates in O(n log^2 n) rounds with high probability.
2010年6月08日 00:00:00 GMThttps://hdl.handle.net/1721.1/557212010年06月08日T00:00:00ZiJam: Jamming Oneself for Secure Wireless Communication
https://hdl.handle.net/1721.1/55650
iJam: Jamming Oneself for Secure Wireless Communication
Katabi, Dina; Gollakota, Shyamnath
Wireless is inherently less secure than wired networks because of its broadcast nature. Attacks that simply snoop on the wireless medium successfully defeat the security of even 802.11 networks using the most recent security standards (WPA2-PSK). In this paper we ask the following question: Can we prevent this kind of eavesdropping from happening? If so, we can potentially defeat the entire class of attacks that rely on snooping. This paper presents iJam, a PHY-layer protocol for OFDM-based wireless systems. iJam ensures that an eavesdropper cannot successfully demodulate a wireless signal not intended for it. To achieve this, iJam strategically introduces interference that prevents an eavesdropper from decoding the data, while allowing the intended receiver to decode it. iJam exploits the properties of 802.11's OFDM signals to ensure that an eavesdropper cannot even tell which parts of the signal are jammed. We implement iJam and evaluate it in a testbed of GNURadios with an 802.11-like physical layer. We show that iJam makes the data bits at the adversary look random, i.e., the BER becomes close to 50%, whereas the receiver can perfectly decode the data.
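The repeat-and-jam trick can be modeled in a few lines of numpy; the constants and the jamming model are simplified stand-ins. The sender transmits each symbol twice, the receiver jams complementary halves of the two copies, and only the receiver, knowing its own jam mask, can stitch the clean samples back together.

    import numpy as np

    rng = np.random.default_rng(1)

    symbol = rng.standard_normal(64)           # stand-in for an OFDM symbol
    copy1, copy2 = symbol.copy(), symbol.copy()

    mask = rng.random(64) < 0.5                # True where copy1 is jammed
    noise = 10 * rng.standard_normal(64)
    copy1[mask] += noise[mask]                 # receiver-generated jamming
    copy2[~mask] += noise[~mask]

    received = np.where(mask, copy2, copy1)    # keep the clean samples
    assert np.allclose(received, symbol)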
2010年6月07日 00:00:00 GMThttps://hdl.handle.net/1721.1/556502010年06月07日T00:00:00ZPower-Aware Computing with Dynamic Knobs
https://hdl.handle.net/1721.1/54799
Power-Aware Computing with Dynamic Knobs
Misailovic, Sasa; Agarwal, Anant; Carbin, Michael; Sidiroglou, Stelios; Hoffmann, Henry; Rinard, Martin
We present PowerDial, a system for dynamically adapting application behavior to execute successfully in the face of load and power fluctuations. PowerDial transforms static configuration parameters into dynamic knobs that the PowerDial control system can manipulate to dynamically trade off the accuracy of the computation in return for reductions in the computational resources that the application requires to produce its results. These reductions translate into power savings. Our experimental results show that PowerDial can enable our benchmark applications to execute responsively in the face of power caps (imposed, for example, in response to cooling system failures) that would otherwise significantly impair the delivered performance. They also show that PowerDial can reduce the number of machines required to meet peak load, in our experiments enabling up to a 75% reduction in direct power and capital costs.
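A hedged sketch of the dynamic-knob control loop, with a hypothetical measure_power callback and an ordered list of knob settings standing in for PowerDial's actual actuators:

    def control_loop(knob_settings, measure_power, power_cap, steps=100):
        # knob_settings: ordered from most accurate (power-hungry) to
        # least accurate (cheap). Returns the index the loop settles on.
        i = 0
        for _ in range(steps):
            power = measure_power(knob_settings[i])
            if power > power_cap and i + 1 < len(knob_settings):
                i += 1        # over the cap: trade accuracy for power
            elif power < 0.9 * power_cap and i > 0:
                i -= 1        # headroom: buy accuracy back
        return i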
2010年5月14日 00:00:00 GMThttps://hdl.handle.net/1721.1/547992010年05月14日T00:00:00ZSIFT Flow: Dense Correspondence across Scenes and its Applications
https://hdl.handle.net/1721.1/54787
SIFT Flow: Dense Correspondence across Scenes and its Applications
Freeman, William T.; Torralba, Antonio; Yuen, Jenny; Liu, Ce
While image alignment has been studied in different areas of computer vision for decades, aligning images depicting different scenes remains a challenging problem. Analogous to optical flow where an image is aligned to its temporally adjacent frame, we propose SIFT flow, a method to align an image to its nearest neighbors in a large image corpus containing a variety of scenes. The SIFT flow algorithm consists of matching densely sampled, pixel-wise SIFT features between two images, while preserving spatial discontinuities. The SIFT features allow robust matching across different scene/object appearances, whereas the discontinuity-preserving spatial model allows matching of objects located at different parts of the scene. Experiments show that the proposed approach robustly aligns complex scene pairs containing significant spatial differences. Based on SIFT flow, we propose an alignment-based large database framework for image analysis and synthesis, where image information is transferred from the nearest neighbors to a query image according to the dense scene correspondence. This framework is demonstrated through concrete applications, such as motion field prediction from a single image, motion synthesis via object transfer, satellite image registration and face recognition.
2010年5月08日 00:00:00 GMThttps://hdl.handle.net/1721.1/547872010年05月08日T00:00:00ZHierarchical Task and Motion Planning in the Now
https://hdl.handle.net/1721.1/54780
Hierarchical Task and Motion Planning in the Now
Kaelbling, Leslie Pack; Lozano-Perez, Tomas
In this paper we outline an approach to the integration of task planning and motion planning that has the following key properties: It is aggressively hierarchical. It makes choices and commits to them in a top-down fashion in an attempt to limit the length of plans that need to be constructed, and thereby exponentially decrease the amount of search required. Importantly, our approach also limits the need to project the effect of actions into the far future. It operates on detailed, continuous geometric representations and partial symbolic descriptions. It does not require a complete symbolic representation of the input geometry or of the geometric effect of the task-level operations.
Workshop on Mobile Manipulation, IEEE International Conference on Robotics and Automation
2010年5月07日 00:00:00 GMThttps://hdl.handle.net/1721.1/547802010年05月07日T00:00:00ZUCM/MIT Indications, Referring Expressions, and Coreference Corpus (UMIREC corpus)
https://hdl.handle.net/1721.1/54766
UCM/MIT Indications, Referring Expressions, and Coreference Corpus (UMIREC corpus)
Hervas, Raquel; Finlayson, Mark Alan
This version of the UMIREC corpus has been superseded by version 1.1, found at http://hdl.handle.net/1721.1/57507. Please do not use version 1.0, as it contains corrupted coreference information. The correct, uncorrupted data is found in version 1.1.
2010年5月12日 00:00:00 GMThttps://hdl.handle.net/1721.1/547662010年05月12日T00:00:00ZAnnotation Guide for the UCM/MIT Indications, Referential Expressions, and Coreference Corpus (UMIREC Corpus)
https://hdl.handle.net/1721.1/54765
Annotation Guide for the UCM/MIT Indications, Referential Expressions, and Coreference Corpus (UMIREC Corpus)
Hervas, Raquel; Finlayson, Mark Alan
This is the annotation guide given to the annotators who created the UCM/MIT Indications, Referring Expressions, and Coreference (UMIREC) Corpus version 1.0. The corpus comprises texts annotated for referring expressions, coreference relations between the referring expressions, and so-called "indication structures", which split referring expressions into constituents (nuclei and modifiers) and mark each constituent as either 'distinctive' or 'descriptive', which indicate whether or not the constituent contains information required for uniquely identifying the referent. The contents of this corpus, the annotation procedure, and the indication structures are described in more detail in a paper titled "The Prevalence of Descriptive Referring Expressions in News and Narrative" published in the proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, held in July 2010 in Uppsala, Sweden (ACL-2010).
2010年5月12日 00:00:00 GMThttps://hdl.handle.net/1721.1/547652010年05月12日T00:00:00ZA User Study Comparing 3D Modeling with Silhouettes and Google SketchUp
https://hdl.handle.net/1721.1/54731
A User Study Comparing 3D Modeling with Silhouettes and Google SketchUp
Igarashi, Takeo; Durand, Fredo; Rivers, Alec
We describe a user study comparing 3D Modeling with Silhouettes and Google SketchUp. In the user study, ten users were asked to create 3D models of three different objects, using either 3D Modeling with Silhouettes or Google SketchUp. Ten different users were then asked to rank images of the models produced by the first group. We show that the models made with 3D Modeling with Silhouettes were ranked significantly higher on average than those made with Google SketchUp.
2010年5月05日 00:00:00 GMThttps://hdl.handle.net/1721.1/547312010年05月05日T00:00:00ZAutomatic Error Finding in Access-Control Policies
https://hdl.handle.net/1721.1/54730
Automatic Error Finding in Access-Control Policies
Jayaraman, Karthick; Rinard, Martin C.; Tripunitara, Mahesh; Ganesh, Vijay; Chapin, Steve
Access-control policies are a key infrastructural technology for computer security. However, a significant problem is that system administrators need to be able to automatically verify whether their policies capture the intended security goals. To address this important problem, researchers have proposed many automated verification techniques. Despite considerable progress in verification techniques, scalability is still a significant issue. Hence, in this paper we propose that error finding complements verification, and is a fruitful way of checking whether or not access control policies implement the security intent of system administrators. Error finding is more scalable (at the cost of completeness), and allows for the use of a wider variety of techniques. In this paper, we describe an abstraction-refinement based technique and its implementation, the Mohawk tool, aimed at finding errors in ARBAC access-control policies. The key insight behind our abstraction-refinement technique is that it is more efficient to look for errors in an abstract policy (with successive refinements, if necessary) than its complete counterpart. Mohawk accepts as input an access-control policy and a safety question. If Mohawk finds an error in the input policy, it terminates with a sequence of actions that cause the error. We provide an extensive comparison of Mohawk with the current state-of-the-art analysis tools. We show that Mohawk scales very well as the size and complexity of the input policies increase, and is orders of magnitude faster than competing tools. The Mohawk tool is open source and available from the Google Code website: http://code.google.com/p/mohawk/
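The abstraction-refinement loop can be sketched generically; Mohawk's actual checking and refinement over ARBAC policies is far more involved, and every callable below is a hypothetical stand-in.

    def find_error(policy, query, abstract, check, replay, refine):
        a = abstract(policy)            # start from a small abstract policy
        while True:
            trace = check(a, query)     # analyze the abstraction
            if trace is None:
                return None             # no error found at this precision
            if replay(policy, trace):
                return trace            # error reproduces on the real policy
            a = refine(a, policy, trace)  # spurious: grow the abstraction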
2010年5月05日 00:00:00 GMThttps://hdl.handle.net/1721.1/547302010年05月05日T00:00:00ZThe Bayes Tree: Enabling Incremental Reordering and Fluid Relinearization for Online Mapping
https://hdl.handle.net/1721.1/54717
The Bayes Tree: Enabling Incremental Reordering and Fluid Relinearization for Online Mapping
Kaess, Michael; Dellaert, Frank; Roberts, Richard; Ila, Viorela
In this paper we present a novel data structure, the Bayes tree, which exploits the connections between graphical model inference and sparse linear algebra. The proposed data structure provides a new perspective on an entire class of simultaneous localization and mapping (SLAM) algorithms. Similar to a junction tree, a Bayes tree encodes a factored probability density, but unlike the junction tree it is directed and maps more naturally to the square root information matrix of the SLAM problem. This makes it eminently suited to encode the sparse nature of the problem, especially in a smoothing and mapping (SAM) context. The inherent sparsity of SAM has already been exploited in the literature to produce efficient solutions in both batch and online mapping. The graphical model perspective allows us to develop a novel incremental algorithm that seamlessly incorporates reordering and relinearization. This obviates the need for expensive periodic batch operations from previous approaches, which negatively affect the performance and detract from the intended online nature of the algorithm. The new method is evaluated using simulated and real-world datasets in both landmark and pose SLAM settings.
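The data structure itself is compact. The sketch below (field names hypothetical) captures the clique-tree shape and the incremental-update property that only the path from a modified clique to the root needs re-elimination.

    from dataclasses import dataclass, field

    @dataclass
    class Clique:
        frontal: list                 # variables eliminated in this clique
        separator: list               # variables shared with the parent
        conditional: object = None    # p(frontal | separator)
        children: list = field(default_factory=list)

    def affected_path(clique, parents):
        # Cliques to re-eliminate when `clique` receives a new factor:
        # just the path from the modified clique up to the root.
        path = []
        while clique is not None:
            path.append(clique)
            clique = parents.get(id(clique))
        return path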
2010年1月29日 00:00:00 GMThttps://hdl.handle.net/1721.1/547172010年01月29日T00:00:00ZOptimizing MapReduce for Multicore Architectures
https://hdl.handle.net/1721.1/54692
Optimizing MapReduce for Multicore Architectures
Kaashoek, Frans; Morris, Robert; Mao, Yandong
MapReduce is a programming model for data-parallel programs originally intended for data centers. MapReduce simplifies parallel programming, hiding synchronization and task management. These properties make it a promising programming model for future processors with many cores, and existing MapReduce libraries such as Phoenix have demonstrated that applications written with MapReduce perform competitively with those written with Pthreads. This paper explores the design of the MapReduce data structures for grouping intermediate key/value pairs, which is often a performance bottleneck on multicore processors. The paper finds the best choice depends on workload characteristics, such as the number of keys used by the application, the degree of repetition of keys, etc. This paper also introduces a new MapReduce library, Metis, with a compromise data structure designed to perform well for most workloads. Experiments with the Phoenix benchmarks on a 16-core AMD-based server show that Metis's data structure performs better than simpler alternatives, including Phoenix.
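The grouping problem can be sketched as follows. This simplified stand-in hashes pairs into buckets and defers per-bucket sorting to the reduce phase; Metis pairs the hash table with a more refined per-bucket structure.

    from collections import defaultdict

    N_BUCKETS = 256

    def make_table():
        return [defaultdict(list) for _ in range(N_BUCKETS)]

    def emit(table, key, value):
        # Map workers call this for each intermediate (key, value) pair.
        table[hash(key) % N_BUCKETS][key].append(value)

    def reduce_all(table, reducer):
        out = {}
        for bucket in table:
            for key in sorted(bucket):      # sort each bucket lazily
                out[key] = reducer(key, bucket[key])
        return out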
2010年5月02日 00:00:00 GMThttps://hdl.handle.net/1721.1/546922010年05月02日T00:00:00ZInstruction-Level Execution Migration
https://hdl.handle.net/1721.1/53748
Instruction-Level Execution Migration
Devadas, Srinivas; Lis, Mieszko; Khan, Omer
We introduce the Execution Migration Machine (EM2), a novel data-centric multicore memory system architecture based on computation migration. Unlike traditional distributed memory multicores, which rely on complex cache coherence protocols to move the data to the core where the computation is taking place, our scheme always moves the computation to the core where the data resides. By doing away with the cache coherence protocol, we are able to boost the effectiveness of per-core caches while drastically reducing hardware complexity. To evaluate the potential of EM2 architectures, we developed a series of PIN/Graphite-based models of an EM2 multicore with 64 x86 cores and, under some simplifying assumptions (a timing model restricted to data memory performance, no instruction cache modeling, high-bandwidth fixed-latency interconnect allowing concurrent migrations), compared them against corresponding directory-based cache-coherent architecture models. We justify our assumptions and show that our conclusions are valid even if our assumptions are removed. Experimental results on a range of SPLASH-2 and PARSEC benchmarks indicate that EM2 can significantly improve per-core cache performance, decreasing overall miss rates by as much as 84% and reducing average memory latency by up to 58%.
2010年4月17日 00:00:00 GMThttps://hdl.handle.net/1721.1/537482010年04月17日T00:00:00ZKongming: A Generative Planner for Hybrid Systems with Temporally Extended Goals
https://hdl.handle.net/1721.1/53720
Kongming: A Generative Planner for Hybrid Systems with Temporally Extended Goals
Li, Hui X.
Most unmanned missions in space and undersea are commanded by a "script" that specifies a sequence of discrete commands and continuous actions. Currently such scripts are mostly hand-generated by human operators. This introduces inefficiency, puts a significant cognitive burden on the engineers, and prevents re-planning in response to environment disturbances or plan execution failure. For discrete systems, the field of autonomy has elevated the level of commanding by developing goal-directed systems, to which human operators specify a series of temporally extended goals to be accomplished, and the goal-directed systems automatically output the correct, executable command sequences. Increasingly, the control of autonomous systems involves performing actions with a mix of discrete and continuous effects. For example, a typical autonomous underwater vehicle (AUV) mission involves discrete actions, like get GPS and take sample, and continuous actions, like descend and ascend, which are influenced by the dynamical model of the vehicle. A hybrid planner generates a sequence of discrete and continuous actions that achieve the mission goals. In this thesis, I present a novel approach to solve the generative planning problem for temporally extended goals for hybrid systems, involving both continuous and discrete actions. The planner, Kongming, incorporates two innovations. First, it employs a compact representation of all hybrid plans, called a Hybrid Flow Graph, which combines the strengths of a Planning Graph for discrete actions and Flow Tubes for continuous actions. Second, it engages novel reformulation schemes to handle temporally flexible actions and temporally extended goals. I have successfully demonstrated controlling an AUV in the Atlantic ocean using mission scripts solely generated by Kongming. I have also empirically evaluated Kongming on various real-world scenarios in the underwater domain and the air vehicle domain, and found it successfully and efficiently generates valid and optimal plans.
PhD thesis
2010年4月09日 00:00:00 GMThttps://hdl.handle.net/1721.1/537202010年04月09日T00:00:00ZGeneralized Conflict Learning For Hybrid Discrete Linear Optimization
https://hdl.handle.net/1721.1/53718
Generalized Conflict Learning For Hybrid Discrete Linear Optimization
Li, Hui X.
Conflict-directed search algorithms have formed the core of practical, model-based reasoning systems for the last three decades. In many of these applications there is a series of discrete constraint optimization problems and a conflict-directed search algorithm, which uses conflicts in the forward search step to focus search away from known infeasibilities and towards the optimal solution. In the arena of model-based autonomy, discrete systems, like deep space probes, have given way to more agile systems, such as coordinated vehicle control, which must robustly control their continuous dynamics. Controlling these systems requires optimizing over continuous, as well as discrete variables, using linear and non-linear as well as logical constraints. This thesis explores the development of algorithms for solving hybrid discrete/linear optimization problems that use conflicts in the forward search direction, generalizing from the conflict-directed search algorithms of model-based reasoning. We introduce a novel algorithm called Generalized Conflict-directed Branch and Bound (GCD-BB). GCD-BB extends traditional Branch and Bound (B&B), by first constructing conflicts from nodes of the search tree that are found to be infeasible or sub-optimal, and then by using these conflicts to guide the forward search away from known infeasible and sub-optimal states. We evaluate GCD-BB empirically on a range of test problems of coordinated air vehicle control. GCD-BB demonstrates a substantial improvement in performance compared to a traditional B&B algorithm, applied to either disjunctive linear programs or an equivalent binary integer program encoding.
SM thesis
2005年5月20日 00:00:00 GMThttps://hdl.handle.net/1721.1/537182005年05月20日T00:00:00ZComputational Re-Photography
https://hdl.handle.net/1721.1/53705
Computational Re-Photography
Agarwala, Aseem; Bae, Soonmin; Durand, Fredo
Rephotographers aim to recapture an existing photograph from the same viewpoint. A historical photograph paired with a well-aligned modern rephotograph can serve as a remarkable visualization of the passage of time. However, the task of rephotography is tedious and often imprecise, because reproducing the viewpoint of the original photograph is challenging. The rephotographer must disambiguate between the six degrees of freedom of 3D translation and rotation, and the confounding similarity between the effects of camera zoom and dolly. We present a real-time estimation and visualization technique for rephotography that helps users reach a desired viewpoint during capture. The input to our technique is a reference image taken from the desired viewpoint. The user moves through the scene with a camera and follows our visualization to reach the desired viewpoint. We employ computer vision techniques to compute the relative viewpoint difference. We guide 3D movement using two 2D arrows. We demonstrate the success of our technique by rephotographing historical images and conducting user studies.
2010年4月07日 00:00:00 GMThttps://hdl.handle.net/1721.1/537052010年04月07日T00:00:00ZDecoupled Sampling for Real-Time Graphics Pipelines
https://hdl.handle.net/1721.1/53330
Decoupled Sampling for Real-Time Graphics Pipelines
Ragan-Kelley, Jonathan; Doggett, Michael; Lehtinen, Jaakko; Chen, Jiawen; Durand, Fredo
We propose decoupled sampling, an approach that decouples shading from visibility sampling in order to enable motion blur and depth-of-field at reduced cost. More generally, it enables extensions of modern real-time graphics pipelines that provide controllable shading rates to trade off quality for performance. It can be thought of as a generalization of GPU-style multisample antialiasing (MSAA) to support unpredictable shading rates, with arbitrary mappings from visibility to shading samples as introduced by motion blur, depth-of-field, and adaptive shading. It is inspired by the Reyes architecture in offline rendering, but targets real-time pipelines by driving shading from visibility samples as in GPUs, and removes the need for micropolygon dicing or rasterization. Decoupled Sampling works by defining a many-to-one hash from visibility to shading samples, and using a buffer to memoize shading samples and exploit reuse across visibility samples. We present extensions of two modern GPU pipelines to support decoupled sampling: a GPU-style sort-last fragment architecture, and a Larrabee-style sort-middle pipeline. We study the architectural implications and derive end-to-end performance estimates on real applications through an instrumented functional simulator. We demonstrate high-quality motion blur and depth-of-field, as well as variable and adaptive shading rates.
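The memoization step reduces to a small cache keyed by a many-to-one mapping from visibility samples to shading samples; the granularity and the shade signature below are illustrative assumptions.

    def make_sampler(shade, rate=4):
        cache = {}
        def sample(prim_id, x, y):
            # Many-to-one mapping: all visibility samples in a
            # rate x rate footprint share one shading sample
            # (a generalization of MSAA's one-shade-per-pixel).
            key = (prim_id, x // rate, y // rate)
            if key not in cache:
                cache[key] = shade(prim_id, key[1], key[2])
            return cache[key]
        return sample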
2010年3月29日 00:00:00 GMThttps://hdl.handle.net/1721.1/533302010年03月29日T00:00:00ZRelational Cloud: The Case for a Database Service
https://hdl.handle.net/1721.1/52606
Relational Cloud: The Case for a Database Service
Wu, Eugene; Madden, Samuel; Zhang, Yang; Jones, Evan; Curino, Carlo
In this paper, we make the case for "databases as a service" (DaaS), with two target scenarios in mind: (i) consolidation of data management functionality for large organizations and (ii) outsourcing data management to a cloud-based service provider for small/medium organizations. We analyze the many challenges to be faced, and discuss the design of a database service we are building, called Relational Cloud. The system has been designed from scratch and combines many recent advances and novel solutions. The prototype we present exploits multiple dedicated storage engines, provides high-availability via transparent replication, supports automatic workload partitioning and live data migration, and provides serializable distributed transactions. While the system is still under active development, we are able to present promising initial results that showcase the key features of our system. The tests are based on TPC benchmarks and real-world data from epinions.com, and show our partitioning, scalability and balancing capabilities.
2010年3月14日 00:00:00 GMThttps://hdl.handle.net/1721.1/526062010年03月14日T00:00:00ZCNS: a GPU-based framework for simulating cortically-organized networks
https://hdl.handle.net/1721.1/51839
CNS: a GPU-based framework for simulating cortically-organized networks
Poggio, Tomaso; Knoblich, Ulf; Mutch, Jim
Computational models whose organization is inspired by the cortex are increasing in both number and popularity. Current instances of such models include convolutional networks, HMAX, Hierarchical Temporal Memory, and deep belief networks. These models present two practical challenges. First, they are computationally intensive. Second, while the operations performed by individual cells, or units, are typically simple, the code needed to keep track of network connectivity can quickly become complicated, leading to programs that are difficult to write and to modify. Massively parallel commodity computing hardware has recently become available in the form of general-purpose GPUs. This helps address the first problem but exacerbates the second. GPU programming adds an extra layer of difficulty, further discouraging exploration. To address these concerns, we have created a programming framework called CNS ('Cortical Network Simulator'). CNS models are automatically compiled and run on a GPU, typically 80-100x faster than on a single CPU, without the user having to learn any GPU programming. A novel scheme for the parametric specification of network connectivity allows the user to focus on writing just the code executed by a single cell. We hope that the ability to rapidly define and run cortically-inspired models will facilitate research in the cortical modeling community. CNS is available under the GNU General Public License.
2010年2月26日 00:00:00 GMThttps://hdl.handle.net/1721.1/518392010年02月26日T00:00:00ZPerformance and error analysis of three part of speech taggers on health texts
https://hdl.handle.net/1721.1/51833
Performance and error analysis of three part of speech taggers on health texts
Zeng, Qing; Curtis, Dorothy
Increasingly, natural language processing (NLP) techniques are being developed and utilized in a variety of biomedical domains. Part of speech tagging is a critical step in many NLP applications. Currently, we are developing an NLP tool for text simplification. As part of this effort, we set out to evaluate several part of speech (POS) taggers. We selected 120 sentences (2375 tokens) from a corpus of six types of diabetes-related health texts and asked human reviewers to tag each word in these sentences to create a "Gold Standard." We then tested each of the three POS taggers against the "Gold Standard." One tagger (dTagger) had been trained on health texts and the other two (MaxEnt and Curran & Clark) were trained on general news articles. We analyzed the errors and placed them into five categories: systematic, close, subtle, difficult source, and other. The three taggers have relatively similar rates of success: dTagger, MaxEnt, and Curran & Clark had 87%, 89% and 90% agreement with the gold standard, respectively. These rates of success are lower than published rates for these taggers. This is probably due to our testing them on a corpus that differs significantly from their training corpora. The taggers made different errors: the dTagger, which had been trained on a set of medical texts (MedPost), made fewer errors on medical terms than MaxEnt and Curran & Clark. The latter two taggers performed better on non-medical terms and we found the difference between their performance and that of dTagger was statistically significant. Our findings suggest that the three POS taggers have similar correct tagging rates, though they differ in the types of errors they make. For the task of text simplification, we are inclined to perform additional training of the Curran & Clark tagger with the Medpost corpus because both the fine grained tagging provided by this tool and the correct recognition of medical terms are equally important.
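The headline numbers are per-token agreement rates against the gold standard, computed as below; the tag names are placeholders.

    def agreement(gold, predicted):
        # Fraction of tokens whose predicted tag matches the gold tag.
        assert len(gold) == len(predicted)
        matches = sum(g == p for g, p in zip(gold, predicted))
        return matches / len(gold)

    gold = ["DT", "NN", "VBZ", "JJ", "NN"]
    pred = ["DT", "NN", "VBZ", "RB", "NN"]
    print(f"{agreement(gold, pred):.0%}")   # 80%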
2010年2月25日 00:00:00 GMThttps://hdl.handle.net/1721.1/518332010年02月25日T00:00:00ZEfficient Cache Coherence on Manycore Optical Networks
https://hdl.handle.net/1721.1/51734
Efficient Cache Coherence on Manycore Optical Networks
Psota, James; Agarwal, Anant; Miller, Jason; Beckmann, Nathan; Kurian, George
Ever since industry has turned to parallelism instead of frequency scaling to improve processor performance, multicore processors have continued to scale to larger and larger numbers of cores. Some believe that multicores will have 1000 cores or more by the middle of the next decade. However, their promise of increased performance will only be reached if their inherent scaling challenges are overcome. One such major scaling challenge is the viability of efficient cache coherence with large numbers of cores. Meanwhile, recent advances in nanophotonic device manufacturing are making CMOS-integrated optics a reality: interconnect technology which can provide significantly more bandwidth at lower power than conventional electrical analogs. The contributions of this paper are two-fold. (1) It presents ATAC, a new manycore architecture that augments an electrical mesh network with an optical network that performs highly efficient broadcasts. (2) It introduces ACKwise, a novel directory-based cache coherence protocol that provides high performance and scalability on any large-scale manycore interconnection network with broadcast capability. Performance evaluation studies using analytical models show that (i) a 1024-core ATAC chip using ACKwise achieves a speedup of 3.9x compared to a similarly-sized pure electrical mesh manycore with a conventional limited directory protocol; (ii) the ATAC chip with ACKwise achieves a speedup of 1.35x compared to the electrical mesh chip with ACKwise; and (iii) a pure electrical mesh chip with ACKwise achieves a speedup of 2.9x over the same chip using a conventional limited directory protocol.
2010年2月11日 00:00:00 GMThttps://hdl.handle.net/1721.1/517342010年02月11日T00:00:00ZCore Count vs Cache Size for Manycore Architectures in the Cloud
https://hdl.handle.net/1721.1/51733
Core Count vs Cache Size for Manycore Architectures in the Cloud
Agarwal, Anant; Miller, Jason; Beckmann, Nathan; Wentzlaff, David
The number of cores which fit on a single chip is growing at an exponential rate while off-chip main memory bandwidth is growing at a linear rate at best. This core count to off-chip bandwidth disparity causes per-core memory bandwidth to decrease as process technology advances. Continuing per-core off-chip bandwidth reduction will cause multicore and manycore chip architects to rethink the optimal grain size of a core and the on-chip cache configuration in order to save main memory bandwidth. This work introduces an analytic model to study the tradeoffs of utilizing increased chip area for larger caches versus more cores. We focus this study on constructing manycore architectures well suited for the emerging application space of cloud computing where many independent applications are consolidated onto a single chip. This cloud computing application mix favors small, power-efficient cores. The model is exhaustively evaluated across a large range of cache and core-count configurations utilizing SPEC Int 2000 miss rates and CACTI timing and area models to determine the optimal cache configurations and the number of cores across four process nodes. The model maximizes aggregate computational throughput and is applied to SRAM and logic process DRAM caches. As an example, our study demonstrates that the optimal manycore configuration in the 32nm node for a 200 mm^2 die uses on the order of 158 cores, with each core containing a 64KB L1I cache, a 16KB L1D cache, and a 1MB L2 embedded-DRAM cache. This study finds that the optimal cache size will continue to grow as process technology advances, but the tradeoff between more cores and larger caches is complex in the face of limited off-chip bandwidth and the non-linearities of cache miss rates and memory controller queuing delay.
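The flavor of such an analytic model can be conveyed with made-up constants; the paper instead calibrates miss rates from SPEC Int 2000 and area/timing from CACTI, so every number below is a hypothetical stand-in.

    DIE_AREA = 200.0      # mm^2
    CORE_AREA = 0.8       # mm^2 per core (made up)
    CACHE_AREA = 1.0      # mm^2 per MB of cache (made up)
    BANDWIDTH = 64e9      # off-chip bytes/s
    LINE = 64             # bytes fetched per miss

    def miss_rate(mb_per_core):
        # Toy power-law miss-rate curve in per-core cache capacity.
        return 0.05 * (0.25 / max(mb_per_core, 1e-3)) ** 0.5

    def throughput(cores, cache_mb, ipc=1.0, freq=2e9, apki=300):
        misses = cores * freq * ipc * (apki / 1000) * miss_rate(cache_mb / cores)
        scale = min(1.0, BANDWIDTH / (misses * LINE))  # bandwidth limit
        return cores * freq * ipc * scale

    # Spend the fixed die area on cores vs. cache; keep the best split.
    best = max(((c, (DIE_AREA - c * CORE_AREA) / CACHE_AREA)
                for c in range(1, int(DIE_AREA / CORE_AREA))),
               key=lambda cfg: throughput(*cfg))
    print(best)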
2010年2月11日 00:00:00 GMThttps://hdl.handle.net/1721.1/517332010年02月11日T00:00:00ZAutomatic Parallelization With Statistical Accuracy Bounds
https://hdl.handle.net/1721.1/51680
Automatic Parallelization With Statistical Accuracy Bounds
Kim, Deokhwan; Misailovic, Sasa; Rinard, Martin
Traditional parallelizing compilers are designed to generate parallel programs that produce identical outputs as the original sequential program. The difficulty of performing the program analysis required to satisfy this goal and the restricted space of possible target parallel programs have both posed significant obstacles to the development of effective parallelizing compilers. The QuickStep compiler is instead designed to generate parallel programs that satisfy statistical accuracy guarantees. The freedom to generate parallel programs whose output may differ (within statistical accuracy bounds) from the output of the sequential program enables a dramatic simplification of the compiler and a significant expansion in the range of parallel programs that it can legally generate. QuickStep exploits this flexibility to take a fundamentally different approach from traditional parallelizing compilers. It applies a collection of transformations (loop parallelization, loop scheduling, synchronization introduction, and replication introduction) to generate a search space of parallel versions of the original sequential program. It then searches this space (prioritizing the parallelization of the most time-consuming loops in the application) to find a final parallelization that exhibits good parallel performance and satisfies the statistical accuracy guarantee. At each step in the search it performs a sequence of trial runs on representative inputs to examine the performance, accuracy, and memory accessing characteristics of the current generated parallel program. An analysis of these characteristics guides the steps the compiler takes as it explores the search space of parallel programs. Results from our benchmark set of applications show that QuickStep can automatically generate parallel programs with good performance and statistically accurate outputs. For two of the applications, the parallelization introduces noise into the output, but the noise remains within acceptable statistical bounds. The simplicity of the compilation strategy and the performance and statistical acceptability of the generated parallel programs demonstrate the advantages of the QuickStep approach.
2010年2月10日 00:00:00 GMThttps://hdl.handle.net/1721.1/516802010年02月10日T00:00:00ZThe Cost of Global Broadcast Using Abstract MAC Layers
https://hdl.handle.net/1721.1/51667
The Cost of Global Broadcast Using Abstract MAC Layers
Lynch, Nancy; Kuhn, Fabian; Kowalski, Dariusz; Khabbazian, Majid
We analyze greedy algorithms for broadcasting messages throughout a multi-hop wireless network, using a slot-based model that includes message collisions without collision detection. Our algorithms are split formally into two pieces: a high-level piece for broadcast and a low-level piece for contention management. We accomplish the split using abstract versions of the MAC layer to encapsulate the contention management. We use two different abstract MAC layers: a basic non-probabilistic one, which our contention management algorithm implements with high probability, and a probabilistic one, which our contention management algorithm implements precisely. Using this approach, we obtain the following complexity bounds: Single-message broadcast, using the basic abstract MAC layer, takes time O(D log(n/epsilon) log(Delta)) to deliver the message everywhere with probability 1 - epsilon, where D is the network diameter, n is the number of nodes, and Delta is the maximum node degree. Single-message broadcast, using the probabilistic abstract MAC layer, takes time only O((D + log(n/epsilon)) log(Delta)). For multi-message broadcast, the bounds are O((D + k' Delta) log(n/epsilon) log(Delta)) using the basic layer and O((D + k' Delta log(n/epsilon)) log(Delta)) using the probabilistic layer, for the time to deliver a single message everywhere in the presence of at most k' concurrent messages.
2010年2月09日 00:00:00 GMThttps://hdl.handle.net/1721.1/516672010年02月09日T00:00:00ZAn Operating System for Multicore and Clouds: Mechanisms and Implementation
https://hdl.handle.net/1721.1/51381
An Operating System for Multicore and Clouds: Mechanisms and Implementation
Modzelewski, Kevin; Miller, Jason; Belay, Adam; Beckmann, Nathan; Gruenwald, Charles, III; Wentzlaff, David; Youseff, Lamia; Agarwal, Anant
Cloud computers and multicore processors are two emerging classes of computational hardware that have the potential to provide unprecedented compute capacity to the average user. In order for the user to effectively harness all of this computational power, operating systems (OSes) for these new hardware platforms are needed. Existing multicore operating systems do not scale to large numbers of cores, and do not support clouds. Consequently, current-day cloud systems push much complexity onto the user, requiring the user to manage individual Virtual Machines (VMs) and deal with many system-level concerns. In this work we describe the mechanisms and implementation of a factored operating system named fos. fos is a single system image operating system across both multicore and Infrastructure as a Service (IaaS) cloud systems. fos tackles OS scalability challenges by factoring the OS into its component system services. Each system service is further factored into a collection of Internet-inspired servers which communicate via messaging. Although designed in a manner similar to distributed Internet services, OS services instead provide traditional kernel services such as file systems, scheduling, memory management, and access to hardware. fos also implements new classes of OS services like fault tolerance and demand elasticity. In this work, we describe our working fos implementation, and provide early performance measurements of fos for both intra-machine and inter-machine operations.
2010年2月08日 00:00:00 GMThttps://hdl.handle.net/1721.1/513812010年02月08日T00:00:00ZSubmodular Secretary Problem and Extensions
https://hdl.handle.net/1721.1/51336
Submodular Secretary Problem and Extensions
Zadimoghaddam, Morteza; Hajiaghayi, MohammadTaghi; Bateni, MohammadHossein
Online auction is an essence of many modern markets, particularly networked markets, in which information about goods, agents, and outcomes is revealed over a period of time, and the agents must make irrevocable decisions without knowing future information. Optimal stopping theory, especially the classic "secretary problem", is a powerful tool for analyzing such online scenarios which generally require optimizing an objective function over the input. The secretary problem and its generalization the "multiple-choice secretary problem" were under a thorough study in the literature. In this paper, we consider a very general setting of the latter problem called the "submodular secretary problem", in which the goal is to select k secretaries so as to maximize the expectation of a (not necessarily monotone) submodular function which defines efficiency of the selected secretarial group based on their overlapping skills. We present the first constant-competitive algorithm for this case. In a more general setting in which selected secretaries should form an independent (feasible) set in each of l given matroids as well, we obtain an O(l log^2 r)-competitive algorithm generalizing several previous results, where r is the maximum rank of the matroids. Another generalization is to consider l knapsack constraints instead of the matroid constraints, for which we present an O(l)-competitive algorithm. In sharp contrast, we show that for the more general setting of the "subadditive secretary problem", there is no o~(sqrt(n))-competitive algorithm and thus submodular functions are the most general functions to consider for constant competitiveness in our setting. We complement this result by giving a matching O(sqrt(n))-competitive algorithm for the subadditive case. At the end, we consider some special cases of our general setting as well.
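For readers new to the area, the classical rule these generalizations build on is short enough to state in code: observe the first n/e candidates without committing, then accept the first record-breaker. It picks the single best candidate with probability about 1/e; the paper's algorithms layer submodular objectives and matroid or knapsack constraints on this skeleton.

    import math
    import random

    def secretary(values):
        n = len(values)
        cutoff = max(1, round(n / math.e))
        threshold = max(values[:cutoff])     # observe, never accept
        for v in values[cutoff:]:
            if v > threshold:
                return v                     # first record-breaker
        return values[-1]                    # forced to take the last one

    wins = sum(secretary(random.sample(range(1000), 1000)) == 999
               for _ in range(10_000))
    print(wins / 10_000)                     # roughly 0.37, i.e. ~1/e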
2010年2月01日 00:00:00 GMThttps://hdl.handle.net/1721.1/513362010年02月01日T00:00:00ZSWIFT: A Narrowband-Friendly Cognitive Wideband Network
https://hdl.handle.net/1721.1/51335
SWIFT: A Narrowband-Friendly Cognitive Wideband Network
Sodini, Charles; Edalat, Farinaz; Katabi, Dina; Kushman, Nate; Rahul, Hariharan
Wideband technologies in the unlicensed spectrum can satisfy the ever-increasing demands for wireless bandwidth created by emerging rich media applications. The key challenge for such systems, however, is to allow narrowband technologies that share these bands (say, 802.11 a/b/g/n, Zigbee) to achieve their normal performance, without compromising the throughput or range of the wideband network. This paper presents SWIFT, the first system where high-throughput wideband nodes are shown in a working deployment to coexist with unknown narrowband devices, while forming a network of their own. Prior work avoids narrowband devices by operating below the noise level and limiting itself to a single contiguous unused band. While this achieves coexistence, it sacrifices the throughput and operating distance of the wideband device. In contrast, SWIFT creates high throughput wireless links by weaving together non-contiguous unused frequency bands that change as narrowband devices enter or leave the environment. This design principle of cognitive aggregation allows SWIFT to achieve coexistence, while operating at normal power, and thereby obtaining higher throughput and greater operating range. We implement SWIFT on a wideband hardware platform, and evaluate it in the presence of 802.11 devices. In comparison to a baseline that coexists with narrowband devices by operating below their noise level, SWIFT is equally narrowband-friendly but achieves 3.6x-10.5x higher throughput and 6x greater range.
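A toy take on cognitive aggregation: sense per-subcarrier energy, mark bins occupied by narrowband devices, and place OFDM data only on the remaining, possibly non-contiguous, bins. The threshold and structure are made up for illustration.

    import numpy as np

    def free_bins(samples, margin_db=10.0):
        # A bin is "busy" if its power sits well above the median floor.
        power = np.abs(np.fft.fft(samples)) ** 2
        floor = np.median(power)
        return power < floor * 10 ** (margin_db / 10)

    def make_symbol(data, mask):
        # Weave data into the unused, possibly non-contiguous bins.
        bins = np.zeros(len(mask), dtype=complex)
        bins[mask] = data[: int(mask.sum())]
        return np.fft.ifft(bins)             # time-domain OFDM symbol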
2008年8月17日 00:00:00 GMThttps://hdl.handle.net/1721.1/513352008年08月17日T00:00:00ZSelective Vectorization for Short-Vector Instructions
https://hdl.handle.net/1721.1/50235
Selective Vectorization for Short-Vector Instructions
Amarasinghe, Saman; Rabbah, Rodric; Larsen, Samuel
Multimedia extensions are nearly ubiquitous in today's general-purpose processors. These extensions consist primarily of a set of short-vector instructions that apply the same opcode to a vector of operands. Vector instructions introduce a data-parallel component to processors that exploit instruction-level parallelism, and present an opportunity for increased performance. In fact, ignoring a processor's vector opcodes can leave a significant portion of the available resources unused. In order for software developers to find short-vector instructions generally useful, however, the compiler must target these extensions with complete transparency and consistent performance. This paper describes selective vectorization, a technique for balancing computation across a processor's scalar and vector units. Current approaches for targeting short-vector instructions directly adopt vectorizing technology first developed for supercomputers. Traditional vectorization, however, can lead to a performance degradation since it fails to account for a processor's scalar resources. We formulate selective vectorization in the context of software pipelining. Our approach creates software pipelines with shorter initiation intervals, and therefore, higher performance. A key aspect of selective vectorization is its ability to manage transfer of operands between vector and scalar instructions. Even when operand transfer is expensive, our technique is sufficiently sophisticated to achieve significant performance gains. We evaluate selective vectorization on a set of SPEC FP benchmarks. On a realistic VLIW processor model, the approach achieves whole-program speedups of up to 1.35x over existing approaches. For individual loops, it provides speedups of up to 1.75x.
2009年12月18日 00:00:00 GMThttps://hdl.handle.net/1721.1/502352009年12月18日T00:00:00ZAdvancing Computational Models of Narrative
https://hdl.handle.net/1721.1/50232
Advancing Computational Models of Narrative
Richards, Whitman; Winston, Patrick Henry; Finlayson, Mark Alan
Report of a Workshop held at the Wylie Center, Beverly, MA, Oct 8-10 2009
2009年12月17日 00:00:00 GMThttps://hdl.handle.net/1721.1/502322009年12月17日T00:00:00ZThe Video Mesh: A Data Structure for Image-based Video Editing
https://hdl.handle.net/1721.1/50231
The Video Mesh: A Data Structure for Image-based Video Editing
Durand, Fredo; Cohen, Michael; Chen, Jiawen; Paris, Sylvain; Wang, Jue; Matusik, Wojciech
This paper introduces the video mesh, a data structure for representing video as 2.5D "paper cutouts." The video mesh allows interactive editing of moving objects and modeling of depth, which enables 3D effects and post-exposure camera control. The video mesh sparsely encodes optical flow as well as depth, and handles occlusion using local layering and alpha mattes. Motion is described by a sparse set of points tracked over time. Each point also stores a depth value. The video mesh is a triangulation over this point set and per-pixel information is obtained by interpolation. The user rotoscopes occluding contours and we introduce an algorithm to cut the video mesh along them. Object boundaries are refined with per-pixel alpha values. The video mesh is at its core a set of texture-mapped triangles; we leverage graphics hardware to enable interactive editing and rendering of a variety of effects. We demonstrate the effectiveness of our representation with a number of special effects including 3D viewpoint changes, object insertion, and depth-of-field manipulation.
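The per-pixel interpolation step is standard barycentric blending over the containing triangle, whose vertices are the sparse tracked points; triangulation and point location are omitted in this sketch.

    import numpy as np

    def barycentric(p, a, b, c):
        # Solve for the barycentric coordinates of p in triangle (a,b,c).
        t = np.array([[b[0] - a[0], c[0] - a[0]],
                      [b[1] - a[1], c[1] - a[1]]])
        u, v = np.linalg.solve(t, np.asarray(p, float) - a)
        return np.array([1 - u - v, u, v])

    def interpolate(p, tri_xy, tri_values):
        # tri_values: per-vertex quantities (e.g. flow or depth).
        w = barycentric(p, *tri_xy)
        return w @ np.asarray(tri_values, float)

    tri = [np.array([0.0, 0.0]), np.array([4.0, 0.0]), np.array([0.0, 4.0])]
    depths = [1.0, 2.0, 3.0]
    print(interpolate((1.0, 1.0), tri, depths))   # 1.75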
2009年12月16日T00:00:00Z https://hdl.handle.net/1721.1/50231
https://hdl.handle.net/1721.1/49869
Perfect and General Virtual Implementation For Perfectly Informed Players
Micali, Silvio; Chen, Jing
We show that, when the players are perfectly informed about each other, essentially all social-choice functions can be rationally robustly implemented via an extensive-form public-action mechanism that (1) is perfectly robust against collusion, (2) requires only a linear number of computation steps and communication bits, and (3) preserves the privacy of the players' types to a very high extent.
2009年12月04日T00:00:00Z https://hdl.handle.net/1721.1/49869
https://hdl.handle.net/1721.1/49868
Sufficient Conditions for Uniform Stability of Regularization Algorithms
Poggio, Tomaso; Rosasco, Lorenzo; Wibisono, Andre
In this paper, we study the stability and generalization properties of penalized empirical-risk minimization algorithms. We propose a set of properties of the penalty term that is sufficient to ensure uniform β-stability: we show that if the penalty function satisfies a suitable convexity property, then the induced regularization algorithm is uniformly β-stable. In particular, our results imply that regularization algorithms with penalty functions which are strongly convex on bounded domains are β-stable. In view of the results in [3], uniform stability implies generalization, and moreover, consistency results can be easily obtained. We apply our results to show that l_p regularization for 1 < p <= 2 and elastic-net regularization are uniformly β-stable, and therefore generalize.
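For reference, the uniform stability notion invoked here (in the standard sense of Bousquet and Elisseeff, presumably the result cited as [3]) says, schematically, that replacing one training example barely changes the loss at any test point:

    \[
      \sup_{z}\ \bigl|\,\ell(f_S, z) - \ell(f_{S^{i}}, z)\,\bigr| \;\le\; \beta
      \qquad \text{for all training sets } S \text{ of size } n \text{ and all } i \in \{1,\dots,n\},
    \]

where f_S is the hypothesis learned from S, S^i is S with its i-th example removed (or replaced), and a bound beta = beta(n) decaying on the order of 1/n suffices for generalization bounds.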
2009年12月01日T00:00:00Z https://hdl.handle.net/1721.1/49868
https://hdl.handle.net/1721.1/49844
A Unified Operating System for Clouds and Manycore: fos
Modzelewski, Kevin; Miller, Jason; Belay, Adam; Beckmann, Nathan; Gruenwald, Charles, III; Wentzlaff, David; Youseff, Lamia; Agarwal, Anant
Single chip processors with thousands of cores will be available in the next ten years, and clouds of multicore processors afford the operating system designer thousands of cores today. Constructing operating systems for manycore and cloud systems faces similar challenges. This work identifies these shared challenges and introduces our solution: a factored operating system (fos) designed to meet the scalability, faultiness, variability of demand, and programming challenges of OSes for single-chip thousand-core manycore systems as well as current day cloud computers. Current monolithic operating systems are not well suited for manycores and clouds as they have taken an evolutionary approach to scaling, such as adding fine-grain locks and redesigning subsystems; however, these approaches do not increase scalability quickly enough. fos addresses the OS scalability challenge by using a message passing design and is composed of a collection of Internet-inspired servers. Each operating system service is factored into a set of communicating servers which in aggregate implement a system service. These servers are designed much in the way that distributed Internet services are designed, but provide traditional kernel services instead of Internet services. Also, fos embraces the elasticity of cloud and manycore platforms by adapting resource utilization to match demand. fos facilitates writing applications across the cloud by providing a single system image across both future 1000+ core manycores and current day Infrastructure as a Service cloud computers. In contrast, current cloud environments do not provide a single system image and introduce complexity for the user by requiring different programming models for intra- vs. inter-machine communication, and by requiring the use of non-OS standard management tools.
2009年11月20日T00:00:00Z https://hdl.handle.net/1721.1/49844
https://hdl.handle.net/1721.1/49814
Distributed Computation in Dynamic Networks
Oshman, Rotem; Lynch, Nancy; Kuhn, Fabian
In this report we investigate distributed computation in dynamic networks in which the network topology changes from round to round. We consider a worst-case model in which the communication links for each round are chosen by an adversary, and nodes do not know who their neighbors for the current round are before they broadcast their messages. The model is intended to capture mobile networks and wireless networks, in which mobility and interference render communication unpredictable, and it allows the study of the fundamental computational power of dynamic networks. In contrast to much of the existing work on dynamic networks, we do not assume that the network eventually stops changing; we require correctness and termination even in networks that change continually. We introduce a stability property called T-interval connectivity (for T >= 1), which stipulates that for every T consecutive rounds there exists a stable connected spanning subgraph. For T = 1 this means that the graph is connected in every round, but changes arbitrarily between rounds. Algorithms for the dynamic graph model must cope with these unceasing changes. We show that in 1-interval connected graphs it is possible for nodes to determine the size of the network and compute any computable function of their initial inputs in O(n^2) rounds using messages of size O(log n + d), where d is the size of the input to a single node. Further, if the graph is T-interval connected for T > 1, the computation can be sped up by a factor of T, and any function can be computed in O(n + n^2 / T) rounds using messages of size O(log n + d). We also give two lower bounds on the gossip problem, which requires the nodes to disseminate k pieces of information to all the nodes in the network. We show an Omega(n log k) bound on gossip in 1-interval connected graphs against centralized algorithms, and an Omega(n + nk / T) bound on exchanging k pieces of information in T-interval connected graphs for a restricted class of randomized distributed algorithms. The T-interval connected dynamic graph model is a novel model, which we believe opens new avenues for research in the theory of distributed computing in wireless, mobile and dynamic networks.
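The flavor of the upper-bound argument can be seen in a small simulation (a simplification: nodes here broadcast their entire knowledge set each round, whereas the paper's O(n^2) bound holds with O(log n + d)-bit messages). In a 1-interval connected graph, every round grows some node's knowledge, so all-to-all dissemination finishes within n - 1 rounds even though the adversary rewires the topology every round:

    import random

    def adversarial_connected_graph(n):
        order = list(range(n)); random.shuffle(order)
        edges = {(order[i], order[i + 1]) for i in range(n - 1)}  # random chain: connected
        edges |= {tuple(random.sample(range(n), 2)) for _ in range(n)}  # extra churn
        return edges

    n = 8
    knowledge = [{i} for i in range(n)]           # each node knows only its own id
    rounds = 0
    while any(len(k) < n for k in knowledge):
        edges = adversarial_connected_graph(n)    # topology changes every round
        new = [set(k) for k in knowledge]
        for u, v in edges:                        # synchronous local broadcast
            new[u] |= knowledge[v]
            new[v] |= knowledge[u]
        knowledge, rounds = new, rounds + 1
    print(f"all {n} ids known everywhere after {rounds} rounds (bound: n-1 = {n - 1})")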
2009年11月10日T00:00:00Z https://hdl.handle.net/1721.1/49814
https://hdl.handle.net/1721.1/49810
Rational Robustness for Mechanism Design
Micali, Silvio; Chen, Jing
Theory of Computation
The currently prevailing equilibrium-based approach to mechanism design suffers from a plurality of fundamental problems, and new conceptual frameworks are needed to solve or sufficiently alleviate them. In this paper, we put forward rational robustness, a new solution concept/implementation notion that is not equilibrium-based; prove its fundamental structural theorems; and compare it with prior notions. Our notion of implementation is specifically built so as to be robust against the problem of equilibrium selection. In companion papers, we prove it robust against other fundamental problems as well.
first draft
2009年11月10日T00:00:00Z https://hdl.handle.net/1721.1/49810
https://hdl.handle.net/1721.1/49809
Graphite: A Distributed Parallel Simulator for Multicores
Beckmann, Nathan; Eastep, Jonathan; Gruenwald, Charles, III; Kurian, George; Kasture, Harshad; Miller, Jason E.; Celio, Christopher; Agarwal, Anant
This paper introduces the open-source Graphite distributed parallel multicore simulator infrastructure. Graphite is designed from the ground up for exploration of future multicore processors containing dozens, hundreds, or even thousands of cores. It provides high performance for fast design space exploration and software development for future processors. Several techniques are used to achieve this performance including: direct execution, multi-machine distribution, analytical modeling, and lax synchronization. Graphite is capable of accelerating simulations by leveraging several machines. It can distribute simulation of an off-the-shelf threaded application across a cluster of commodity Linux machines with no modification to the source code. It does this by providing a single, shared address space and consistent single-process image across machines. Graphite is designed to be a simulation framework, allowing different component models to be easily replaced to either model different architectures or tradeoff accuracy for performance. We evaluate Graphite from a number of perspectives and demonstrate that it can simulate target architectures containing over 1000 cores on ten 8-core servers. Performance scales well as more machines are added with near linear speedup in many cases. Simulation slowdown is as low as 41x versus native execution for some applications. The Graphite infrastructure and existing models will be released as open-source software to allow the community to simulate their own architectures and extend and improve the framework.
2009年11月09日T00:00:00Z https://hdl.handle.net/1721.1/49809
https://hdl.handle.net/1721.1/49808
Smartlocks: Self-Aware Synchronization through Lock Acquisition Scheduling
Agarwal, Anant; Santambrogio, Marco D.; Wingate, David; Eastep, Jonathan
As multicore processors become increasingly prevalent, system complexity is skyrocketing. The advent of the asymmetric multicore compounds this -- it is no longer practical for an average programmer to balance the system constraints associated with today's multicores and worry about new problems like asymmetric partitioning and thread interference. Adaptive, or self-aware, computing has been proposed as one method to help application and system programmers confront this complexity. These systems take some of the burden off programmers by monitoring themselves and optimizing or adapting to meet their goals. This paper introduces an open-source self-aware synchronization library for multicores and asymmetric multicores called Smartlocks. Smartlocks is a spin-lock library that adapts its internal implementation during execution using heuristics and machine learning to optimize toward a user-defined goal, which may relate to performance, power, or other problem-specific criteria. Smartlocks builds upon adaptation techniques from prior work like reactive locks, but introduces a novel form of adaptation designed for asymmetric multicores that we term lock acquisition scheduling. Lock acquisition scheduling chooses which waiter will get the lock next, for the best long-term effect, when multiple threads (or processes) are spinning for a lock. Our results demonstrate empirically that lock scheduling is important for asymmetric multicores and that Smartlocks significantly outperform conventional and reactive locks for asymmetries like dynamic variations in processor clock frequencies caused by thermal throttling events.
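The scheduling idea is easy to state in isolation. Below is a toy, single-threaded rendering (not the Smartlocks implementation, which adapts its policy online with machine learning): on each release the lock is handed to whichever spinning waiter a heuristic ranks highest, here simply the waiter on the fastest core.

    import heapq

    class ScheduledLock:
        def __init__(self):
            self.waiters = []                         # max-heap via negated priority
            self.holder = None

        def request(self, thread_id, core_speed):     # core_speed: the heuristic signal
            heapq.heappush(self.waiters, (-core_speed, thread_id))
            if self.holder is None:                   # lock free: grant immediately
                self.release()

        def release(self):
            self.holder = None
            if self.waiters:                          # lock acquisition scheduling:
                _, tid = heapq.heappop(self.waiters)  # pick the most profitable waiter
                self.holder = tid
            return self.holder

    lock = ScheduledLock()
    for tid, speed in [(1, 1.0), (2, 3.0), (3, 2.0)]:
        lock.request(tid, speed)                      # thread 1 arrives first, gets the lock
    print(lock.holder, lock.release())                # then thread 2 (fastest core), not 3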
2009年11月09日T00:00:00Z https://hdl.handle.net/1721.1/49808
https://hdl.handle.net/1721.1/49527
Automated home-cage behavioral phenotyping of mice
Yu, Xinlin; Steele, Andrew D.; Khilnani, Vinita; Garrote, Estibaliz; Jhuang, Hueihan; Serre, Thomas; Poggio, Tomaso
We describe a trainable computer vision system enabling the automated analysis of complex mouse behaviors. We provide software and a very large manually annotated video database used for training and testing the system. Our system outperforms leading commercial software and performs on par with human scoring, as measured from the ground-truth manual annotations of thousands of clips of freely behaving animals. We show that the home-cage behavior profiles provided by the system are sufficient to accurately predict the strain identity of individual animals in the case of two standard inbred and two non-standard mouse strains. Our software should complement existing sensor-based automated approaches and help develop an adaptable, comprehensive, high-throughput, fine-grained, automated analysis of rodent behavior.
2009年10月26日T00:00:00Z https://hdl.handle.net/1721.1/49527
https://hdl.handle.net/1721.1/49526
Co-Clustering with Generative Models
Golland, Polina; Lashkari, Danial
In this paper, we present a generative model for co-clustering and develop algorithms based on the mean field approximation for the corresponding modeling problem. These algorithms can be viewed as generalizations of the traditional model-based clustering; they extend hard co-clustering algorithms such as Bregman co-clustering to include soft assignments. We show empirically that these model-based algorithms offer better performance than their hard-assignment counterparts, especially with increasing problem complexity.
2009年11月03日T00:00:00Z https://hdl.handle.net/1721.1/49526
https://hdl.handle.net/1721.1/49525
Propagation Networks: A Flexible and Expressive Substrate for Computation
Radul, Alexey
I propose a shift in the foundations of computation. Practically all ideas of general-purpose computation today are founded either on execution of sequences of atomic instructions, i.e., assembly languages, or on evaluation of tree-structured expressions, i.e., most higher level programming languages. Both have served us well in the past, but it is increasingly clear that we need something more. I suggest that we can build general-purpose computation on propagation of information through networks of stateful cells interconnected with stateless autonomous asynchronous computing elements. Various forms of this general idea have been used with great success for various special purposes; perhaps the most immediate example is constraint propagation in constraint satisfaction systems. These special-purpose systems, however, are all complex and all different, and neither compose well, nor interoperate well, nor generalize well. A foundational layer is missing. The key insight in this work is that a cell should not be seen as storing a value, but as accumulating information about a value. The cells should never forget information -- such monotonicity prevents race conditions in the behavior of the network. Monotonicity of information need not be a severe restriction: for example, carrying reasons for believing each thing makes it possible to explore but then possibly reject tentative hypotheses, thus appearing to undo something, while maintaining monotonicity. Accumulating information is a broad enough design principle to encompass arbitrary computation. The object of this dissertation is therefore to architect a general-purpose computing system based on propagation networks; to subsume expression evaluation under propagation just as instruction execution is subsumed under expression evaluation; to demonstrate that a general-purpose propagation system can recover all the benefits that have been derived from special-purpose propagation systems, allow them to compose and interoperate, and offer further expressive power beyond what we have known in the past; and finally to contemplate the lessons that such a fundamental shift can teach us about the deep nature of computation.
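A toy propagator network in Python conveys the accumulate-information idea (an illustrative sketch, not the dissertation's Scheme system): cells hold numeric intervals that can only narrow, and a constraint wired between cells re-fires whenever any of them gains information, so partial knowledge flows in whatever direction it can.

    class Cell:
        def __init__(self):
            self.lo, self.hi = float("-inf"), float("inf")  # "know nothing" interval
            self.watchers = []

        def add_content(self, lo, hi):
            new = (max(self.lo, lo), min(self.hi, hi))      # merge, never forget
            if new[0] > new[1]:
                raise ValueError("contradiction")
            if new != (self.lo, self.hi):                   # strictly more information:
                self.lo, self.hi = new                      # wake the propagators
                for wake in self.watchers:
                    wake()

    def adder(a, b, out):
        # out = a + b as a multidirectional constraint, not a one-way function
        def fire():
            out.add_content(a.lo + b.lo, a.hi + b.hi)
            a.add_content(out.lo - b.hi, out.hi - b.lo)
            b.add_content(out.lo - a.hi, out.hi - a.lo)
        for cell in (a, b, out):
            cell.watchers.append(fire)
        fire()

    x, y, total = Cell(), Cell(), Cell()
    adder(x, y, total)
    total.add_content(10, 10)
    x.add_content(3, 7)
    print((y.lo, y.hi))        # y has been deduced to lie in (3, 7)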
PhD thesis
2009年11月03日T00:00:00Z https://hdl.handle.net/1721.1/49525
https://hdl.handle.net/1721.1/49511
Shape from Sheen
Adelson, Edward H.; Torralba, Antonio; Fleming, Roland W.
2009年10月22日T00:00:00Z https://hdl.handle.net/1721.1/49511
https://hdl.handle.net/1721.1/49428
Iterative Projection Methods for Structured Sparsity Regularization
Rosasco, Lorenzo; Verri, Alessandro; Santoro, Matteo; Mosci, Sofia; Villa, Silvia
In this paper we propose a general framework to characterize and solve the optimization problems underlying a large class of sparsity-based regularization algorithms. More precisely, we study the minimization of learning functionals that are sums of a differentiable data term and a convex non-differentiable penalty. These latter penalties have recently become popular in machine learning since they make it possible to enforce various kinds of sparsity properties in the solution. Leveraging the theory of Fenchel duality and subdifferential calculus, we derive explicit optimality conditions for the regularized solution and propose a general iterative projection algorithm whose convergence to the optimal solution can be proved. The generality of the framework is illustrated, considering several examples of regularization schemes, including l1 regularization (and several variants), multiple kernel learning and multi-task learning. Finally, some features of the proposed framework are empirically studied.
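For the l1 case mentioned above, this style of iteration reduces to the familiar forward-backward (ISTA-style) scheme: a gradient step on the differentiable data term followed by the penalty's proximal/projection step, which for l1 is soft-thresholding. The sketch below uses made-up data and a least-squares data term.

    import numpy as np

    def soft_threshold(v, t):
        return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

    def ista(X, y, lam, steps=500):
        w = np.zeros(X.shape[1])
        L = np.linalg.norm(X, 2) ** 2          # Lipschitz constant of the gradient
        for _ in range(steps):
            grad = X.T @ (X @ w - y)           # gradient of 0.5*||Xw - y||^2
            w = soft_threshold(w - grad / L, lam / L)  # proximal step for l1
        return w

    rng = np.random.default_rng(0)
    X = rng.normal(size=(50, 20))
    w_true = np.zeros(20); w_true[:3] = [2.0, -1.5, 1.0]   # sparse ground truth
    y = X @ w_true + 0.01 * rng.normal(size=50)
    print(np.round(ista(X, y, lam=1.0), 2))    # recovered weights are sparse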
2009年10月14日T00:00:00Z https://hdl.handle.net/1721.1/49428
https://hdl.handle.net/1721.1/49426
Understanding and Supporting Directed Content Sharing on the Web
Miller, Rob; Karger, David; Marcus, Adam; Bernstein, Michael
To find interesting, personally relevant web content, we often rely on friends and colleagues to pass links along as they encounter them. In this paper, we study and augment link-sharing via e-mail, the most popular means of sharing web content today. Armed with survey data indicating that active sharers of novel web content are often those that actively seek it out, we present FeedMe, a plug-in for Google Reader that makes directed sharing of content a more salient part of the user experience. Our survey research indicates that sharing is moderated by concern about relevancy to the recipient, a desire to send only novel content to the recipient, and the effort required to share. FeedMe allays these concerns by recommending friends who may be interested in seeing the content, providing information on what the recipient has seen and how many emails they have received recently, and giving recipients the opportunity to provide lightweight feedback when they appreciate shared content. FeedMe introduces a novel design space for mixed-initiative social recommenders: friends who know the user voluntarily vet the material on the user's behalf. We present a two-week field experiment (N=60) demonstrating that FeedMe's recommendations and social awareness features made it easier and more enjoyable to share content that recipients appreciated and would not have found otherwise.
2009年10月07日T00:00:00Z https://hdl.handle.net/1721.1/49426
https://hdl.handle.net/1721.1/49425
Notes on the Shannon Entropy of the Neural Response
Shakhnarovich, Greg; Bouvrie, Jake; Rosasco, Lorenzo; Smale, Steve
In these notes we focus on the concept of Shannon entropy in an attempt to provide a systematic way of assessing the discrimination properties of the neural response, and quantifying the role played by the number of layers and the number of templates.
2009年10月09日T00:00:00Z https://hdl.handle.net/1721.1/49425
https://hdl.handle.net/1721.1/49416
A Bayesian inference theory of attention: neuroscience and algorithms
Chikkerur, Sharat; Serre, Thomas; Poggio, Tomaso
The past four decades of research in visual neuroscience have generated a large and disparate body of literature on the role of attention [Itti et al., 2005]. Although several models have been developed to describe specific properties of attention, a theoretical framework that explains the computational role of attention and is consistent with all known effects is still needed. Recently, several authors have suggested that visual perception can be interpreted as a Bayesian inference process [Rao et al., 2002, Knill and Richards, 1996, Lee and Mumford, 2003]. Within this framework, top-down priors via cortical feedback help disambiguate noisy bottom-up sensory input signals. Building on earlier work by Rao [2005], we show that this Bayesian inference proposal can be extended to explain the role and predict the main properties of attention: namely to facilitate the recognition of objects in clutter. Visual recognition proceeds by estimating the posterior probabilities for objects and their locations within an image via an exchange of messages between ventral and parietal areas of the visual cortex. Within this framework, spatial attention is used to reduce the uncertainty in feature information; feature-based attention is used to reduce the uncertainty in location information. In conjunction, they are used to recognize objects in clutter. Here, we find that several key attentional phenomena such as pop-out, multiplicative modulation and change in contrast response emerge naturally as properties of the network. We explain the idea in three stages. We start by developing a simplified model of attention in the brain, identifying the primary areas involved and their interconnections. Secondly, we propose a Bayesian network where each node has direct neural correlates within our simplified biological model. Finally, we elucidate the properties of the resulting model, showing that the predictions are consistent with physiological and behavioral evidence.
2009年10月03日T00:00:00Z https://hdl.handle.net/1721.1/49416
https://hdl.handle.net/1721.1/49415
Attentive processing improves object recognition
Chikkerur, Sharat; Poggio, Tomaso; Serre, Thomas
The human visual system can recognize several thousand object categories irrespective of their position and size. This combination of selectivity and invariance is built up gradually across several stages of visual processing. However, the recognition of multiple objects in cluttered visual scenes presents a difficult problem for human as well as machine vision systems. The human visual system has evolved to perform two stages of visual processing: a pre-attentive parallel processing stage, in which the entire visual field is processed at once, and a slow serial attentive processing stage, in which a region of interest in an input image is selected for "specialized" analysis by an attentional spotlight. We argue that this strategy evolved to overcome the limitation of purely feedforward processing in the presence of clutter and crowding. Using a Bayesian model of attention along with a hierarchical model of feedforward recognition on a data set of real-world images, we show that this two-stage attentive processing can improve recognition in cluttered and crowded conditions.
2009年10月02日T00:00:00Z https://hdl.handle.net/1721.1/49415
https://hdl.handle.net/1721.1/46820
Efficient POMDP Forward Search by Predicting the Posterior Belief Distribution
Roy, Nicholas; He, Ruijie
Online, forward-search techniques have demonstrated promising results for solving problems in partially observable environments. These techniques depend on the ability to efficiently search and evaluate the set of beliefs reachable from the current belief. However, enumerating or sampling action-observation sequences to compute the reachable beliefs is computationally demanding; coupled with the need to satisfy real-time constraints, existing online solvers can only search to a limited depth. In this paper, we propose that policies can be generated directly from the distribution of the agent's posterior belief. When the underlying state distribution is Gaussian, and the observation function is an exponential family distribution, we can calculate this distribution of beliefs without enumerating the possible observations. This property not only enables us to plan in problems with large observation spaces, but also allows us to search deeper by considering policies composed of multi-step action sequences. We present the Posterior Belief Distribution (PBD) algorithm, an efficient forward-search POMDP planner for continuous domains, demonstrating that better policies are generated when we can perform deeper forward search.
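The enabling observation is easiest to see in the linear-Gaussian special case (the paper generalizes to exponential-family observation models): after an action, the Kalman posterior covariance is the same no matter which observation arrives, so forward search can roll beliefs forward without branching on observations. A one-dimensional sketch, with illustrative matrices:

    import numpy as np

    A, Q = np.array([[1.0]]), np.array([[0.1]])    # dynamics: x' = Ax + process noise
    C, R = np.array([[1.0]]), np.array([[0.5]])    # observation: z = Cx + sensor noise

    def predicted_posterior_cov(Sigma):
        Sigma_bar = A @ Sigma @ A.T + Q            # process update
        K = Sigma_bar @ C.T @ np.linalg.inv(C @ Sigma_bar @ C.T + R)
        return (np.eye(1) - K @ C) @ Sigma_bar     # identical for every observation z

    Sigma = np.array([[1.0]])
    for depth in range(3):                         # multi-step lookahead, no branching
        Sigma = predicted_posterior_cov(Sigma)
        print(f"depth {depth + 1}: posterior variance {Sigma[0, 0]:.3f}")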
2009年09月23日T00:00:00Z https://hdl.handle.net/1721.1/46820
https://hdl.handle.net/1721.1/46819
Whanaungatanga: Sybil-proof routing with social networks
Lesniewski-Laas, Chris; Kaashoek, M. Frans
Decentralized systems, such as distributed hash tables, are subject to the Sybil attack, in which an adversary creates many false identities to increase its influence. This paper proposes a routing protocol for a distributed hash table that is strongly resistant to the Sybil attack. This is the first solution to this problem with sublinear run time and space usage. The protocol uses the social connections between users to build routing tables that enable Sybil-resistant distributed hash table lookups. With a social network of N well-connected honest nodes, the protocol can tolerate up to O(N/log N) "attack edges" (social links from honest users to phony identities). This means that an adversary has to fool a large fraction of the honest users before any lookups will fail. The protocol builds routing tables that contain O(N log^(3/2) N) entries per node. Lookups take O(1) time. Simulation results, using social network graphs from LiveJournal, Flickr, and YouTube, confirm the analytical results.
2009年09月24日T00:00:00Z https://hdl.handle.net/1721.1/46819
https://hdl.handle.net/1721.1/46722
Finding aircraft collision-avoidance strategies using policy search methods
Kaelbling, Leslie Pack; Lozano-Perez, Tomas
A progress report describing the application of policy gradient and policy search by dynamic programming methods to an aircraft collision avoidance problem inspired by the requirements of next-generation TCAS.
2009年09月12日T00:00:00Z https://hdl.handle.net/1721.1/46722
https://hdl.handle.net/1721.1/46710
Code for LOLCAT Method (Variant of Gillespie Algorithm)
Beal, Jacob; Indurkhya, Sagar
This entry provides the publicly available code and data for the LOLCAT Method, developed by Sagar Indurkhya and Jacob Beal in the paper "Reaction factoring and bipartite update graphs accelerate the Gillespie algorithm for large-scale biochemical systems."
2009年09月04日T00:00:00Z https://hdl.handle.net/1721.1/46710
https://hdl.handle.net/1721.1/46709
Using Code Perforation to Improve Performance, Reduce Energy Consumption, and Respond to Failures
Hoffmann, Henry; Misailovic, Sasa; Sidiroglou, Stelios; Agarwal, Anant; Rinard, Martin
Many modern computations (such as video and audio encoders, Monte Carlo simulations, and machine learning algorithms) are designed to trade off accuracy in return for increased performance. To date, such computations typically use ad-hoc, domain-specific techniques developed specifically for the computation at hand. We present a new general technique, code perforation, for automatically augmenting existing computations with the capability of trading off accuracy in return for performance. In contrast to existing approaches, which typically require the manual development of new algorithms, our implemented SpeedPress compiler can automatically apply code perforation to existing computations with no developer intervention whatsoever. The result is a transformed computation that can respond almost immediately to a range of increased performance demands while keeping any resulting output distortion within acceptable user-defined bounds. We have used SpeedPress to automatically apply code perforation to applications from the PARSEC benchmark suite. The results show that the transformed applications can run as much as two to three times faster than the original applications while distorting the output by less than 10%. Because the transformed applications can operate successfully at many points in the performance/accuracy tradeoff space, they can (dynamically and on demand) navigate the tradeoff space to either maximize performance subject to a given accuracy constraint, or maximize accuracy subject to a given performance constraint. We also demonstrate the SpeedGuard runtime system which uses code perforation to enable applications to automatically adapt to challenging execution environments such as multicore machines that suffer core failures or machines that dynamically adjust the clock speed to reduce power consumption or to protect the machine from overheating.
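The transformation itself is simple to illustrate by hand (SpeedPress applies it automatically and checks the resulting distortion against user-defined bounds; the workload below is invented): a perforated loop executes only a subset of its iterations.

    def mean_full(samples):
        return sum(samples) / len(samples)

    def mean_perforated(samples, rate=2):
        kept = samples[::rate]                 # execute every rate-th iteration only
        return sum(kept) / len(kept)

    samples = [((i * 37) % 101) / 101 for i in range(100000)]
    exact, fast = mean_full(samples), mean_perforated(samples, rate=4)
    print(f"exact={exact:.4f} perforated={fast:.4f} "
          f"distortion={abs(exact - fast) / exact:.2%}")

Here roughly a quarter of the work computes nearly the same answer; the compiler's job is deciding where such skipping is safe and how much distortion it causes.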
2009年09月03日T00:00:00Z https://hdl.handle.net/1721.1/46709
https://hdl.handle.net/1721.1/46708
Lightweight Communications and Marshalling for Low-Latency Interprocess Communication
Moore, David; Olson, Edwin; Huang, Albert
We describe the Lightweight Communications and Marshalling (LCM) library for message passing and data marshalling. The primary goal of LCM is to simplify the development of low-latency message passing systems, targeted at real-time robotics applications. LCM comprises several components: a data type specification language, a message passing system, logging/playback tools, and real-time analysis tools. LCM provides a platform- and language-independent type specification language. These specifications can be compiled into platform and language specific implementations, eliminating the need for users to implement marshalling code while guaranteeing run-time type safety. Messages can be transmitted between different processes using LCM's message-passing system, which implements a publish/subscribe model. LCM's implementation is notable in providing low-latency messaging and eliminating the need for a central communications "hub". This architecture makes it easy to mix simulated, recorded, and live data sources. A number of logging, playback, and traffic inspection tools simplify common development and debugging tasks. LCM is targeted at robotics and other real-time systems where low latency is critical; its messaging model permits dropping messages in order to minimize the latency of new messages. In this paper, we explain LCM's design, evaluate its performance, and describe its application to a number of autonomous land, underwater, and aerial robots.
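To give a feel for the type-specification idea: the struct in the comment below is written in the style of spec the paper describes (the message name and fields are invented), and the Python is a hand-rolled stand-in for what generated marshalling code accomplishes, not LCM's actual output.

    # Hypothetical message spec, in the spirit of LCM's type language:
    #
    #   struct pose_t {
    #       int64_t utime;
    #       double  position[3];
    #   }

    import struct

    POSE_FMT = ">q3d"                      # big-endian: one int64 + three float64s

    def encode_pose(utime, position):
        return struct.pack(POSE_FMT, utime, *position)

    def decode_pose(buf):
        utime, x, y, z = struct.unpack(POSE_FMT, buf)
        return utime, (x, y, z)

    msg = encode_pose(123456789, (1.0, 2.0, 3.5))
    print(len(msg), decode_pose(msg))      # 32-byte wire form round-trips exactly

Fixing the byte layout in a declaration, rather than in each program, is what lets senders and receivers in different languages and on different platforms interoperate safely.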
2009年09月02日T00:00:00Z https://hdl.handle.net/1721.1/46708
https://hdl.handle.net/1721.1/46700
Information Flow for Secure Distributed Applications
Cheng, Winnie Wing-Yee
Private and confidential information is increasingly stored online and increasingly being exposed due to human errors as well as malicious attacks. Information leaks threaten confidentiality, lead to lawsuits, damage enterprise reputations, and cost billions of dollars. While distributed computing architectures provide data and service integration, they also create information flow control problems due to the interaction complexity among service providers. A main problem is the lack of an appropriate programming model to capture expected information flow behaviors in these large distributed software infrastructures. This research tackles this problem by proposing a programming methodology and enforcement platform for application developers to protect and share their sensitive data. We introduce Aeolus, a new platform intended to make it easier to build distributed applications that avoid the unauthorized release of information. The Aeolus security model is based on information flow control but differs from previous work in ways that we believe make it easier to use and understand. In addition, Aeolus provides a number of new mechanisms (anonymous closures, compound tags, boxes, and shared volatile state) to ease the job of writing applications. This thesis provides examples to show how Aeolus features support secure distributed applications. It describes the system design issues and solutions in designing a prototype implementation and presents performance results that show our platform has low overhead.
PhD thesis
2009年08月27日T00:00:00Z https://hdl.handle.net/1721.1/46700
https://hdl.handle.net/1721.1/46691
AvatarSAT: An Auto-tuning Boolean SAT Solver
Ganesh, Vijay; Singh, Rishabh; Near, Joseph P.; Rinard, Martin
We present AvatarSAT, a SAT solver that uses machine-learning classifiers to automatically tune the heuristics of an off-the-shelf SAT solver on a per-instance basis. The classifiers use features of both the input and conflict clauses to select parameter settings for the solver's tunable heuristics. On a randomly selected set of SAT problems chosen from the 2007 and 2008 SAT competitions, AvatarSAT is, on average, over two times faster than MiniSAT based on the geometric mean speedup measure and 50% faster based on the arithmetic mean speedup measure. Moreover, AvatarSAT is hundreds to thousands of times faster than MiniSAT on many hard SAT instances and is never more than twenty times slower than MiniSAT on any SAT instance.
2009年08月26日T00:00:00Z https://hdl.handle.net/1721.1/46691
https://hdl.handle.net/1721.1/46690
Detecting Hazardous Intensive Care Patient Episodes Using Real-time Mortality Models
Hug, Caleb
The modern intensive care unit (ICU) has become a complex, expensive, data-intensive environment. Caregivers maintain an overall assessment of their patients based on important observations and trends. If an advanced monitoring system could also reliably provide a systemic interpretation of a patient's observations, it could help caregivers interpret these data more rapidly and perhaps more accurately. In this thesis I use retrospective analysis of mixed medical/surgical intensive care patients to develop predictive models. Logistic regression is applied to 7048 development patients with several hundred candidate variables. These candidate variables range from simple vitals to long term trends and baseline deviations. Final models are selected by backward elimination on top cross-validated variables and validated on 3018 additional patients. The real-time acuity score (RAS) that I develop demonstrates strong discrimination ability for patient mortality, with an ROC area (AUC) of 0.880. The final model includes a number of variables known to be associated with mortality, but also computationally intensive variables absent in other severity scores. In addition to RAS, I also develop secondary outcome models that perform well at predicting pressor weaning (AUC=0.825), intra-aortic balloon pump removal (AUC=0.816), the onset of septic shock (AUC=0.843), and acute kidney injury (AUC=0.742). Real-time mortality prediction is a feasible way to provide continuous risk assessment for ICU patients. RAS offers similar discrimination ability when compared to models computed once per day, based on aggregate data over that day. Moreover, RAS mortality predictions are better at discrimination than a customized SAPS II score (Day 3 AUC=0.878 vs AUC=0.849, p < 0.05). The secondary outcome models also provide interesting insights into patient responses to care and patient risk profiles. While models trained for specifically recognizing secondary outcomes consistently outperform the RAS model at their specific tasks, RAS provides useful baseline risk estimates throughout these events and in some cases offers a notable level of predictive utility.
PhD thesis
2009年08月26日T00:00:00Z https://hdl.handle.net/1721.1/46690
https://hdl.handle.net/1721.1/46361
Extending a MOOS-IvP Autonomy System and Users Guide to the IvPBuild Toolbox
Benjamin, Michael R.; Newman, Paul M.; Schmidt, Henrik; Leonard, John J.
This document describes how to extend the suite of MOOS applications and IvP Helm behaviors distributed with the MOOS-IvP software bundle from www.moos-ivp.org. It covers (a) a straw-man repository with a place-holder MOOS application and IvP Behavior, with a working CMake build structure, (b) a brief overview of the MOOS application class with an example application, (c) an overview of the IvP Behavior class with an example behavior, and (d) the IvPBuild Toolbox for generation of objective functions within behaviors.
2009年08月20日T00:00:00Z https://hdl.handle.net/1721.1/46361
https://hdl.handle.net/1721.1/46353
Guaranteed in-order packet delivery using Exclusive Dynamic Virtual Channel Allocation
Devadas, Srinivas; Cho, Myong Hyon; Shim, Keun Sup; Lis, Mieszko
In-order packet delivery, a critical abstraction for many higher-level protocols, can severely limit the performance potential in low-latency networks (common, for example, in network-on-chip designs with many cores). While basic variants of dimension-order routing guarantee in-order delivery, improving performance by adding multiple dynamically allocated virtual channels or using other routing schemes compromises this guarantee. Although this can be addressed by reordering out-of-order packets at the destination core, such schemes incur significant overheads, and, in the worst case, raise the specter of deadlock or require expensive retransmission. We present Exclusive Dynamic VCA, an oblivious virtual channel allocation scheme which combines the performance advantages of dynamic virtual allocation with in-network, deadlock-free in-order delivery. At the same time, our scheme reduces head-of-line blocking, often significantly improving throughput compared to equivalent baseline (out-of-order) dimension-order routing when multiple virtual channels are used, and so may be desirable even when in-order delivery is not required. Implementation requires only minor, inexpensive changes to traditional oblivious dimension-order router architectures, more than offset by the removal of packet reorder buffers and logic.
2009年08月18日T00:00:00Z https://hdl.handle.net/1721.1/46353
https://hdl.handle.net/1721.1/46351
Application Heartbeats for Software Performance and Health
Miller, Jason; Agarwal, Anant; Santambrogio, Marco; Eastep, Jonathan; Hoffmann, Henry
Adaptive, or self-aware, computing has been proposed as one method to help application programmers confront the growing complexity of multicore software development. However, existing approaches to adaptive systems are largely ad hoc and often do not manage to incorporate the true performance goals of the applications they are designed to support. This paper presents an enabling technology for adaptive computing systems: Application Heartbeats. The Application Heartbeats framework provides a simple, standard programming interface that applications can use to indicate their performance and system software (and hardware) can use to query an application's performance. Several experiments demonstrate the simplicity and efficacy of the Application Heartbeat approach. First, the PARSEC benchmark suite is instrumented with Application Heartbeats to show the broad applicability of the interface. Then, an adaptive H.264 encoder is developed to show how applications might use Application Heartbeats internally. Next, an external resource scheduler is developed which assigns cores to an application based on its performance as specified with Application Heartbeats. Finally, the adaptive H.264 encoder is used to illustrate how Application Heartbeats can aid fault tolerance.
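A minimal rendering of the interface concept (the class and method names here are invented for illustration and are not the framework's actual API): the application emits a beat per unit of work, and any observer, such as an external scheduler, can compare the measured heart rate against the application's stated target.

    import time

    class Heartbeat:
        def __init__(self, target_rate_hz):
            self.target = target_rate_hz       # the application's performance goal
            self.stamps = []

        def beat(self):                        # called once per unit of work
            self.stamps.append(time.monotonic())

        def rate(self, window=10):             # beats/sec over recent history
            recent = self.stamps[-window:]
            if len(recent) < 2:
                return 0.0
            return (len(recent) - 1) / (recent[-1] - recent[0])

    hb = Heartbeat(target_rate_hz=100.0)
    for _ in range(50):
        time.sleep(0.005)                      # stand-in for one frame of work
        hb.beat()
    # a scheduler would add resources while hb.rate() < hb.target, and reclaim
    # them while hb.rate() comfortably exceeds it
    print(f"measured {hb.rate():.0f} beats/s, target {hb.target:.0f}")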
2009年08月07日T00:00:00Z https://hdl.handle.net/1721.1/46351
https://hdl.handle.net/1721.1/46335
CG2Real: Improving the Realism of Computer Generated Images using a Large Collection of Photographs
Pfister, Hanspeter; Freeman, William T.; Avidan, Shai; Dale, Kevin; Johnson, Micah K.; Matusik, Wojciech
Computer Graphics (CG) has achieved a high level of realism, producing strikingly vivid images. This realism, however, comes at the cost of long and often expensive manual modeling, and most often humans can still distinguish between CG images and real images. We present a novel method to make CG images look more realistic that is simple and accessible to novice users. Our system uses a large collection of photographs gathered from online repositories. Given a CG image, we retrieve a small number of real images with similar global structure. We identify corresponding regions between the CG and real images using a novel mean-shift cosegmentation algorithm. The user can then automatically transfer color, tone, and texture from matching regions to the CG image. Our system only uses image processing operations and does not require a 3D model of the scene, making it fast and easy to integrate into digital content creation workflows. Results of a user study show that our improved CG images appear more realistic than the originals.
2009年07月15日T00:00:00Z https://hdl.handle.net/1721.1/46335
https://hdl.handle.net/1721.1/46322
The Guided Improvement Algorithm for Exact, General-Purpose, Many-Objective Combinatorial Optimization
Jackson, Daniel; Estler, H.-Christian; Rayside, Derek
This paper presents a new general-purpose algorithm for exact solving of combinatorial many-objective optimization problems. We call this new algorithm the guided improvement algorithm. The algorithm is implemented on top of the non-optimizing relational constraint solver Kodkod. We compare the performance of this new algorithm against two algorithms from the literature [Gavanelli 2002, Lukasiewycz et al. 2007, Laumanns et al. 2006] on three micro-benchmark problems (n-Queens, n-Rooks, and knapsack) and on two aerospace case studies. Results indicate that the new algorithm is better for the kinds of many-objective problems that our aerospace collaborators are interested in solving. The new algorithm returns Pareto-optimal solutions as it computes.
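The core loop is compact enough to sketch (with a brute-force solve() over a toy two-objective knapsack standing in for the Kodkod relational solver): solve once, repeatedly demand a solution dominating the incumbent until none exists, which certifies a Pareto point; then block everything that point dominates and repeat.

    items = [(3, 5, 4), (2, 8, 1), (4, 3, 6), (1, 2, 2)]   # (weight, value, durability)
    CAP = 6

    def objs(sel):                             # two objectives, both maximized
        return (sum(v for _, v, _ in sel), sum(d for _, _, d in sel))

    def solve(constraints):
        # stand-in solver: first feasible subset satisfying every constraint
        for mask in range(1 << len(items)):
            sel = [items[i] for i in range(len(items)) if mask >> i & 1]
            if sum(w for w, _, _ in sel) <= CAP and all(c(sel) for c in constraints):
                return sel
        return None

    blocked = []
    while True:
        sol = solve(blocked)
        if sol is None:
            break                              # no undominated region left
        while True:                            # guided improvement: climb
            lo = objs(sol)
            dominating = solve(blocked + [lambda s, lo=lo:
                all(a >= b for a, b in zip(objs(s), lo)) and objs(s) != lo])
            if dominating is None:             # nothing dominates: Pareto-optimal
                break
            sol = dominating
        pt = objs(sol)
        print("Pareto point (value, durability):", pt)
        blocked.append(lambda s, pt=pt:        # exclude the region pt dominates
            not all(a <= b for a, b in zip(objs(s), pt)))

On this toy instance the loop prints the two Pareto points, (15, 7) and (5, 8), and each is emitted as soon as it is certified, matching the "returns Pareto-optimal solutions as it computes" property.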
2009年07月03日T00:00:00Z https://hdl.handle.net/1721.1/46322
https://hdl.handle.net/1721.1/45652
Programming Manifolds
Bachrach, Jonathan; Beal, Jacob
Many programming domains involve the manipulation of values distributed through a manifold - examples include sensor networks, smart materials, and biofilms. This paper describes a programming semantics for manifolds based on the amorphous medium abstraction, which places a computational device at every point in the manifold. This abstraction enables the creation of programs that automatically scale to networks of different size and device density. This semantics is currently implemented in our language Proto and compiles for execution on Mica2 Motes and several other platforms.
2007年01月01日T00:00:00Z https://hdl.handle.net/1721.1/45652
https://hdl.handle.net/1721.1/45600
Interactive Visual Histories for Vector Graphics
Scull, Craig; Johnson, Steve; Aliaga, Frederick; Paris, Sylvain; Su, Sara L.; Durand, Fredo
Presentation and graphics software enables users to experiment with variations of illustrations. They can revisit recent editing operations using the ubiquitous undo command, but they are limited to sequential exploration. We propose a new interaction metaphor and visualization for operation history. While editing, a user can access a history mode in which actions are denoted by graphical depictions appearing on top of the document. Our work is inspired by the visual language of film storyboards and assembly instructions. Our storyboard provides an interactive visual history, summarizing the editing of a document or a selected object. Each view is composed of action depictions representing the user's editing actions and enables the user to consider the operation history in context rather than in a disconnected list view. This metaphor provides instant access to any past action and we demonstrate that this is an intuitive interface to a selective undo mechanism.
2009年06月24日T00:00:00Z https://hdl.handle.net/1721.1/45600
https://hdl.handle.net/1721.1/45599
Enhanced Visual Authoring Using Operation History
Su, Sara L.
Graphical editors have introduced great flexibility to the designer's workflow, providing powerful digital tools and enabling the creation of complex and compelling designs. This thesis presents methods for improving these interactions by leveraging operation history. Much instrumentation and activity logging in software has been for the purpose of debugging, that is, for the benefit of the programmer or analyst. Our work addresses the mining of operation history for the benefit of the end user. We present three main contributions in this area. First, we introduce selection expansion, a method for facilitating the reuse of complex multiple-item selections by identifying items that are likely to be edited together. We then discuss an extension of this work, soft grouping, which gives users more control than standard selection and more flexibility than standard grouping. Finally, we present an interactive visualization of operation history, interactive storyboards, which enables in-context browsing and manipulation of operation history. We demonstrate these approaches in the context of vector graphics editing and present the results of pilot studies using our software implementation. While this thesis focuses on the usage patterns of graphic designers, many of the strategies could be generalized to other domains.
PhD thesis
2009年06月24日T00:00:00Z https://hdl.handle.net/1721.1/45599
https://hdl.handle.net/1721.1/45598
An integrated model of visual attention using shape-based features
Poggio, Tomaso; Serre, Thomas; Tan, Cheston; Chikkerur, Sharat
Apart from helping shed some light on human perceptual mechanisms, modeling visual attention has important applications in computer vision. It has been shown to be useful in priming object detection, pruning interest points, quantifying visual clutter as well as predicting human eye movements. Prior work has either relied on purely bottom-up approaches or top-down schemes using simple low-level features. In this paper, we outline a top-down visual attention model based on shape-based features. The same shape-based representation is used to represent both the objects and the scenes that contain them. The spatial priors imposed by the scene and the feature priors imposed by the target object are combined in a Bayesian framework to generate a task-dependent saliency map. We show that our approach can predict the location of objects as well as match eye movements (92% overlap with human observers). We also show that the proposed approach performs better than existing bottom-up and top-down computational models.
2009年06月20日T00:00:00Z https://hdl.handle.net/1721.1/45598
https://hdl.handle.net/1721.1/45569
An Overview of MOOS-IvP and a Brief Users Guide to the IvP Helm Autonomy Software
Benjamin, Michael R.; Leonard, John J.; Schmidt, Henrik; Newman, Paul M.
This document describes the IvP Helm - an Open Source behavior-based autonomy application for unmanned vehicles. IvP is short for interval programming - a technique for representing and solving multi-objective optimization problems. Behaviors in the IvP Helm are reconciled using multi-objective optimization when in competition with each other for influence of the vehicle. The IvP Helm is written as a MOOS application where MOOS is a set of Open Source publish-subscribe autonomy middleware tools. This document describes the configuration and use of the IvP Helm, provides examples of simple missions and information on how to download and build the software from the MOOS-IvP server at www.moos-ivp.org.
2009年06月18日T00:00:00Z https://hdl.handle.net/1721.1/45569
https://hdl.handle.net/1721.1/45568
Keeping Mobile Robots Connected
Lynch, Nancy; Ley-Wild, Ruy; Kuhn, Fabian; Cornejo, Alejandro
Designing robust algorithms for mobile agents with reliable communication is difficult due to the distributed nature of computation; in mobile ad hoc networks (MANETs) the matter is exacerbated by the need to ensure connectivity. Existing distributed algorithms provide coordination but typically assume connectivity is ensured by other means. We present a connectivity service that encapsulates an arbitrary motion planner and can refine any plan to preserve connectivity (the graph of agents remains connected) and ensure progress (the agents advance towards their goal). The service is realized by a distributed algorithm that is modular in that it makes no assumptions of the motion-planning mechanism except the ability for an agent to query its position and intended goal position, local in that it uses 1-hop broadcast to communicate with nearby agents but does not need any network routing infrastructure, and oblivious in that it does not depend on previous computations. We prove the progress of the algorithm in one round is at least Omega(min(d,r)), where d is the minimum distance between an agent and its target and r is the communication radius. We characterize the worst-case configuration and show that when d >= r this bound is tight and the algorithm is optimal, since no algorithm can guarantee greater progress. Finally we show all agents get epsilon-close to their targets within O(D_0/r + n^2/epsilon) rounds, where n is the number of agents and D_0 is the initial distance to the targets.
2009年06月17日T00:00:00Z https://hdl.handle.net/1721.1/45568
https://hdl.handle.net/1721.1/45567
Partitioning Strategies for Concurrent Programming
Devadas, Srinivas; Agarwal, Anant; Hoffmann, Henry
This work presents four partitioning strategies, or patterns, useful for decomposing a serial application into multiple concurrently executing parts. These partitioning strategies augment the commonly used task and data parallel design patterns by recognizing that applications are spatiotemporal in nature. Therefore, data and instruction decomposition are further distinguished by whether the partitioning is done in the spatial or in temporal dimension. Thus, this work describes four decomposition strategies: spatial data partitioning (SDP), temporal data partitioning (TDP), spatial instruction partitioning (SIP), and temporal instruction partitioning (TIP), while cataloging the benefits and drawbacks of each. In addition, the practical use of these strategies is demonstrated through a case study in which they are applied to implement several different parallelizations of a multicore H.264 encoder for HD video. This case study illustrates both the application of the patterns and their effects on the performance of the encoder.
2009年06月16日T00:00:00Z https://hdl.handle.net/1721.1/45567
https://hdl.handle.net/1721.1/45566
A Useful Homomorphic Encryption Method
Micali, Silvio
2009年06月15日T00:00:00Z https://hdl.handle.net/1721.1/45566
https://hdl.handle.net/1721.1/45565
Simple LCD Transmitter Camera Receiver Data Link
Katabi, Dina; Raskar, Ramesh; Mohan, Ankit; Woo, Grace
We demonstrate a free-space optical data link for indoor environments built from a consumer camera and projector, commodity devices already available for visual computing. Through design, prototyping, and experimentation with this commodity hardware, we analyze a practical optical solution to wireless challenges unmet by classic RF communication, as well as its drawbacks. We also summarize and introduce new applications enabled by such setups.
2009年06月15日T00:00:00Z https://hdl.handle.net/1721.1/45565
https://hdl.handle.net/1721.1/45563
Coherent Reaction
Edwards, Jonathan
Side effects are both the essence and bane of imperative programming. The programmer must carefully coordinate actions to manage their side effects upon each other. Such coordination is complex, error-prone, and fragile. Coherent reaction is a new model of change-driven computation that coordinates effects automatically. State changes trigger events called reactions that in turn change other states. A coherent execution order is one in which each reaction executes before any others that are affected by its changes. A coherent order is discovered iteratively by detecting incoherencies as they occur and backtracking their effects. Unlike alternative solutions, much of the power of imperative programming is retained, as is the common sense notion of mutable state. Automatically coordinating actions lets the programmer express what to do, not when to do it. Coherent reactions are embodied in the Coherence language, which is specialized for interactive applications like those common on the desktop and web. The fundamental building block of Coherence is the dynamically typed mutable tree. The fundamental abstraction mechanism is the virtual tree, whose value is lazily computed, and whose behavior is generated by coherent reactions.
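The coordination problem and its resolution can be miniaturized as follows (this fixpoint restatement glosses over the paper's incoherence detection and backtracking machinery, but shows the programmer-visible contract): reactions are listed in a deliberately wrong order, yet re-execution until quiescence yields the result of a coherent order.

    state = {"fahrenheit": 212.0, "celsius": 0.0, "label": ""}

    def to_celsius():                        # reaction: fahrenheit -> celsius
        state["celsius"] = (state["fahrenheit"] - 32) * 5 / 9

    def to_label():                          # reaction: celsius -> label
        state["label"] = f"{state['celsius']:.1f} C"

    reactions = [to_label, to_celsius]       # deliberately incoherent ordering

    changed = True
    while changed:                           # re-run until no reaction changes state,
        changed = False                      # i.e., every reaction saw its final inputs
        for react in reactions:
            before = dict(state)
            react()
            changed |= state != before

    print(state["label"])                    # "100.0 C" despite the bad initial order

The programmer declared what each reaction computes; the runtime, not the programmer, discovered an order in which the computations are mutually consistent.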
2009年06月12日T00:00:00Z https://hdl.handle.net/1721.1/45563
https://hdl.handle.net/1721.1/45553
Modeling Radio Networks
Lynch, Nancy; Newport, Calvin
We describe a modeling framework and collection of foundational composition results for the study of probabilistic distributed algorithms in synchronous radio networks. Existing results in this setting rely on informal descriptions of the channel behavior and therefore lack easy comparability and are prone to error caused by definition subtleties. Our framework rectifies these issues by providing: (1) a method to precisely describe a radio channel as a probabilistic automaton; (2) a mathematical notion of implementing one channel using another channel, allowing for direct comparisons of channel strengths and a natural decomposition of problems into implementing a more powerful channel and solving the problem on the powerful channel; (3) a mathematical definition of a problem and solving a problem; (4) a pair of composition results that simplify the tasks of proving properties about channel implementation algorithms and combining problems with channel implementations. Our goal is to produce a model streamlined for the needs of the radio network algorithms community.
2009年06月04日T00:00:00Z https://hdl.handle.net/1721.1/45553
https://hdl.handle.net/1721.1/45549
Gradient Clock Synchronization in Dynamic Networks
Locher, Thomas; Kuhn, Fabian; Oshman, Rotem
In recent years, large-scale decentralized computer networks such as peer-to-peer and mobile ad hoc networks have become increasingly prevalent. The topologies of many of these networks are often highly dynamic. This is especially true for ad hoc networks formed by mobile wireless devices. In this paper, we study the fundamental problem of clock synchronization in dynamic networks. We show that there is an inherent trade-off between the skew S guaranteed along sufficiently old links and the time needed to guarantee a small skew along new links. For any sufficiently large initial skew on a new link, there are executions in which the time required to reduce the skew on the link to O(S) is at least Omega(n/S). We show that this bound is tight for moderately small values of S. Assuming a fixed set of n nodes and an arbitrary pattern of edge insertions and removals, a weak dynamic connectivity requirement suffices to prove the following results. We present an algorithm that always maintains a skew of O(n) between any two nodes in the network. For a parameter S = Omega(sqrt{rho n}), where rho is the maximum hardware clock drift, it is further guaranteed that if a communication link between two nodes u, v persists in the network for at least Omega(n/S) time, the clock skew between u and v is reduced to no more than O(S).
2009年5月29日 00:00:00 GMThttps://hdl.handle.net/1721.1/455492009年05月29日T00:00:00ZSepia: a Framework for Natural Language Semantics
https://hdl.handle.net/1721.1/45548
Sepia: a Framework for Natural Language Semantics
Marton, Gregory Adam; Westrick, Linda Brown
To help explore linguistic semantics in the context of computational natural language understanding, Sepia provides a realization of the central theoretical idea of categorial grammar: linking words and phrases to compositional lambda semantics. The Sepia framework provides a language in which to express complex transformations from text to data structures, and tools surrounding that language for parsing and machine learning. Lambda semantics are expressed as arbitrary Scheme programs, unlimited in the semantic representations they may build, and the rules for transformation are expressed in Combinatory Categorial Grammar, though the details of the grammar formalism may be easily changed. This report explains the major design decisions, and is meant to teach the reader how to understand Sepia semantics and how to create lexical items for a new language understanding task.
Source code and technical description
2009年5月28日 00:00:00 GMThttps://hdl.handle.net/1721.1/455482009年05月28日T00:00:00ZScene Classification with a Biologically Inspired Method
https://hdl.handle.net/1721.1/45516
Scene Classification with a Biologically Inspired Method
Terashima, Yoshito
We present a biologically motivated method for scene image classification. The core of the method is a shape-based image property provided by a hierarchical feedforward model of the visual cortex [18]. Edge-based and color-based image properties are additionally used to improve accuracy. The method consists of two stages of image analysis. In the first stage, each of three classification paths uses one image property (i.e., shape-, edge-, or color-based features) independently. In the second stage, a single classifier assigns the category of an image based on the probability distributions of the first-stage classifier outputs. Experiments show that the method boosts classification accuracy over the shape-based model. We demonstrate that this method achieves an accuracy comparable to other reported methods on a publicly available color image dataset.
2009年5月10日 00:00:00 GMThttps://hdl.handle.net/1721.1/455162009年05月10日T00:00:00ZThe Abstract MAC Layer
https://hdl.handle.net/1721.1/45515
The Abstract MAC Layer
Kuhn, Fabian; Newport, Calvin; Lynch, Nancy
A diversity of possible communication assumptions complicates the study of algorithms and lower bounds for radio networks. We address this problem by defining an Abstract MAC Layer. This service provides reliable local broadcast communication, with timing guarantees stated in terms of a collection of abstract delay functions applied to the relevant contention. Algorithm designers can analyze their algorithms in terms of these functions, independently of specific channel behavior. Concrete implementations of the Abstract MAC Layer over basic radio network models generate concrete definitions for these delay functions, automatically adapting bounds proven for the abstract service to bounds for the specific radio network under consideration. To illustrate this approach, we use the Abstract MAC Layer to study the new problem of Multi-Message Broadcast, a generalization of standard single-message broadcast, in which any number of messages arrive at any processes at any times. We present and analyze two algorithms for Multi-Message Broadcast in static networks: a simple greedy algorithm and one that uses regional leaders. We then indicate how these results can be extended to mobile networks.
2009年5月11日 00:00:00 GMThttps://hdl.handle.net/1721.1/455152009年05月11日T00:00:00Z4D Frequency Analysis of Computational Cameras for Depth of Field Extension
https://hdl.handle.net/1721.1/45513
4D Frequency Analysis of Computational Cameras for Depth of Field Extension
Levin, Anat; Hasinoff, Samuel W.; Freeman, William T.; Green, Paul; Durand, Fredo
Depth of field (DOF), the range of scene depths that appear sharp in a photograph, poses a fundamental tradeoff in photography---wide apertures are important to reduce imaging noise, but they also increase defocus blur. Recent advances in computational imaging modify the acquisition process to extend the DOF through deconvolution. Because deconvolution quality is a tight function of the frequency power spectrum of the defocus kernel, designs with high spectra are desirable. In this paper we study how to design effective extended-DOF systems, and show an upper bound on the maximal power spectrum that can be achieved. We analyze defocus kernels in the 4D light field space and show that in the frequency domain, only a low-dimensional 3D manifold contributes to focus. Thus, to maximize the defocus spectrum, imaging systems should concentrate their limited energy on this manifold. We review several computational imaging systems and show either that they spend energy outside the focal manifold or do not achieve a high spectrum over the DOF. Guided by this analysis we introduce the lattice-focal lens, which concentrates energy at the low-dimensional focal manifold and achieves a higher power spectrum than previous designs. We have built a prototype lattice-focal lens and present extended depth of field results.
2009年5月08日 00:00:00 GMThttps://hdl.handle.net/1721.1/455132009年05月08日T00:00:00ZATAC: A Manycore Processor with On-Chip Optical Network
https://hdl.handle.net/1721.1/45510
ATAC: A Manycore Processor with On-Chip Optical Network
Liu, Jifeng; Psota, James; Beckmann, Nathan; Miller, Jason; Michel, Jurgen; Eastep, Jonathan; Kurian, George; Kimerling, Lionel; Agarwal, Anant; Beals, Mark
Ever since industry turned to parallelism instead of frequency scaling to improve processor performance, multicore processors have continued to scale to larger and larger numbers of cores. Some believe that multicores will have 1000 cores or more by the middle of the next decade. However, their promise of increased performance will only be reached if their inherent scaling and programming challenges are overcome. Meanwhile, recent advances in nanophotonic device manufacturing are making chip-stack optics a reality: an interconnect technology that can provide significantly more bandwidth at lower power than conventional electrical analogs. Perhaps more importantly, optical interconnect also has the potential to enable new, easy-to-use programming models built on an inexpensive broadcast mechanism. This paper introduces ATAC, a new manycore architecture that capitalizes on the recent advances in optics to address a number of the challenges that future manycore designs will face. The new constraints and opportunities associated with on-chip optical interconnect are presented and explored in the design of ATAC. Furthermore, this paper introduces ACKwise, a novel directory-based cache coherence protocol that takes advantage of the special properties of ATAC to achieve high performance and scalability on large-scale manycores. Early performance results show that a 1000-core ATAC chip achieves a speedup of as much as 39% when compared with a similarly sized manycore with an electrical mesh network.
2009年5月05日 00:00:00 GMThttps://hdl.handle.net/1721.1/455102009年05月05日T00:00:00ZRemote Store Programming: Mechanisms and Performance
https://hdl.handle.net/1721.1/45509
Remote Store Programming: Mechanisms and Performance
Wentzlaff, David; Agarwal, Anant; Hoffmann, Henry
This paper presents remote store programming (RSP). This paradigm combines usability and efficiency through the exploitation of a simple hardware mechanism, the remote store, which can easily be added to existing multicores. Remote store programs are marked by fine-grained and one-sided communication which results in a stream of data flowing from the registers of a sending process to the cache of a destination process. The RSP model and its hardware implementation trade a relatively high store latency for a low load latency because loads are more common than stores, and it is easier to tolerate store latency than load latency. This paper demonstrates the performance advantages of remote store programming by comparing it to both cache-coherent shared memory and direct memory access (DMA) based approaches using the TILEPro64 processor. The paper studies two applications: a two-dimensional Fast Fourier Transform (2D FFT) and an H.264 encoder for high-definition video. For a 2D FFT using 56 cores, RSP is 1.64x faster than DMA and 4.4x faster than shared memory. For an H.264 encoder using 40 cores, RSP achieves the same performance as DMA and 4.8x the performance of shared memory. Along with these performance advantages, RSP requires the least hardware support of the three. RSP's features, performance, and hardware simplicity make it well suited to the embedded processing domain.
2009年5月05日 00:00:00 GMThttps://hdl.handle.net/1721.1/455092009年05月05日T00:00:00ZRisk Allocation for Multi-agent Systems using Tatonnement
https://hdl.handle.net/1721.1/45142
Risk Allocation for Multi-agent Systems using Tatonnement
Williams, Brian C.; Ono, Masahiro
This paper proposes a new market-based distributed planning algorithm for multi-agent systems under uncertainty, called MIRA (Market-based Iterative Risk Allocation). In large coordination problems, from power grid management to multi-vehicle missions, multiple agents act collectively in order to optimize the performance of the system, while satisfying mission constraints. These optimal plans are particularly susceptible to risk when uncertainty is introduced. We present a distributed planning algorithm that minimizes the system cost while ensuring that the probability of violating mission constraints is below a user-specified level. We build upon the paradigm of risk allocation (Ono and Williams, AAAI-08), in which the planner optimizes not only the sequence of actions, but also its allocation of risk among each constraint at each time step. We extend the concept of risk allocation to multi-agent systems by highlighting risk as a good that is traded in a computational market. The equilibrium price of risk that balances the supply and demand is found by an iterative price adjustment process called tatonnement (also known as Walrasian auction). The simulation results demonstrate the efficiency and optimality of the proposed distributed planner.
2009年4月22日 00:00:00 GMThttps://hdl.handle.net/1721.1/451422009年04月22日T00:00:00ZComputing Network Coordinates in the Presence of Byzantine Faults
https://hdl.handle.net/1721.1/45141
Computing Network Coordinates in the Presence of Byzantine Faults
Zhou, You
Network coordinate systems allow for efficient construction of large-scale distributed systems on the Internet. Coordinates provide locality information in a compact way, without requiring each node to contact every potential neighbor; distances between two nodes' coordinates represent estimates of the network latency between them. Past work on network coordinates has assumed that all nodes in the system behave correctly. The techniques in these systems do not behave well when nodes are Byzantine. These Byzantine failures, wherein a faulty node can behave arbitrarily, can make the coordinate-based distance estimates meaningless. For example, a Byzantine node can delay responding to some other node, thus distorting that node's computation of its own location. We present a network coordinate system based on landmarks, reference nodes that are used for measurements, some of which may be Byzantine faulty. It scales linearly in the number of clients computing their coordinates and does not require excessive network traffic to allow clients to do so. Our results show that our system is able to compute accurate coordinates even when some landmarks are exhibiting Byzantine faults.
MEng thesis
2009年4月16日 00:00:00 GMThttps://hdl.handle.net/1721.1/451412009年04月16日T00:00:00ZUnderstanding and evaluating blind deconvolution algorithms
https://hdl.handle.net/1721.1/44964
Understanding and evaluating blind deconvolution algorithms
Freeman, William; Durand, Fredo; Weiss, Yair; Levin, Anat
Blind deconvolution is the recovery of a sharp version of a blurred image when the blur kernel is unknown. Recent algorithms have afforded dramatic progress, yet many aspects of the problem remain challenging and hard to understand. The goal of this paper is to analyze and evaluate recent blind deconvolution algorithms both theoretically and experimentally. We explain the previously reported failure of the naive MAP approach by demonstrating that it mostly favors no-blur explanations. On the other hand we show that since the kernel size is often smaller than the image size, a MAP estimation of the kernel alone can be well constrained and accurately recover the true blur. The plethora of recent deconvolution techniques makes an experimental evaluation on ground-truth data important. We have collected blur data with ground truth and compared recent algorithms under equal settings. Additionally, our data demonstrates that the shift-invariant blur assumption made by most algorithms is often violated.
2009年3月31日 00:00:00 GMThttps://hdl.handle.net/1721.1/449642009年03月31日T00:00:00ZFragment Grammars: Exploring Computation and Reuse in Language
https://hdl.handle.net/1721.1/44963
Fragment Grammars: Exploring Computation and Reuse in Language
O'Donnell, Timothy J.; Tenenbaum, Joshua B.; Goodman, Noah D.
Language relies on a division of labor between stored units and structure-building operations which combine the stored units into larger structures. This division of labor leads to a tradeoff: more structure-building means less need to store, while more storage means less need to compute structure. We develop a hierarchical Bayesian model called fragment grammar to explore the optimum balance between structure-building and reuse. The model is developed in the context of stochastic functional programming (SFP), in particular using a probabilistic variant of Lisp known as the Church programming language (Goodman, Mansinghka, Roy, Bonawitz, & Tenenbaum, 2008). We show how to formalize several probabilistic models of language structure using Church, and how fragment grammar generalizes one of them---adaptor grammars (Johnson, Griffiths, & Goldwater, 2007). We conclude with experimental data from adults and preliminary evaluations of the model on natural language corpus data.
2009年3月31日 00:00:00 GMThttps://hdl.handle.net/1721.1/449632009年03月31日T00:00:00ZRepresenting Small Group Evolution
https://hdl.handle.net/1721.1/44959
Representing Small Group Evolution
Wormald, Nicholas; Richards, Whitman
Understanding the dynamics of network evolution rests in part on the representation chosen to characterize the evolutionary process. We offer a simple, three-parameter representation based on subgraphs that capture three important properties of social networks: leadership, team alignment or bonding among members, and diversity of expertise. When plotted in this representation, the evolution of a typical small group such as a start-up or street gang has a spiral trajectory, moving toward a tentative fixed point as membership increases to two dozen or so. We show that a simple probabilistic model for recruitment and bonding cannot explain these observations, and suggest that strategic moves among group members may come into play.
2009年3月30日 00:00:00 GMThttps://hdl.handle.net/1721.1/449592009年03月30日T00:00:00ZOblivious Routing in On-Chip Bandwidth-Adaptive Networks
https://hdl.handle.net/1721.1/44958
Oblivious Routing in On-Chip Bandwidth-Adaptive Networks
Kinsy, Michel; Wen, Tina; Shim, Keun Sup; Lis, Mieszko; Cho, Myong Hyon; Devadas, Srinivas
Oblivious routing can be implemented on simple router hardware, but network performance suffers when routes become congested. Adaptive routing attempts to avoid hot spots by re-routing flows, but requires more complex hardware to determine and configure new routing paths. We propose on-chip bandwidth-adaptive networks to mitigate the performance problems of oblivious routing and the complexity issues of adaptive routing. In a bandwidth-adaptive network, the bisection bandwidth of a network can adapt to changing network conditions. We describe one implementation of a bandwidth-adaptive network in the form of a two-dimensional mesh with adaptive bidirectional links, where the bandwidth of the link in one direction can be increased at the expense of the other direction. Efficient local intelligence is used to reconfigure each link, and this reconfiguration can be done very rapidly in response to changing traffic demands. We compare the hardware designs of a unidirectional and bidirectional link and evaluate the performance gains provided by a bandwidth-adaptive network in comparison to a conventional network under uniform and bursty traffic when oblivious routing is used.
2009年3月27日 00:00:00 GMThttps://hdl.handle.net/1721.1/449582009年03月27日T00:00:00ZFinding Bugs in Web Applications Using Dynamic Test Generation and Explicit State Model Checking
https://hdl.handle.net/1721.1/44956
Finding Bugs in Web Applications Using Dynamic Test Generation and Explicit State Model Checking
Tip, Frank; Ernst, Michael D.; Dig, Danny; Dolby, Julian; Kiezun, Adam; Artzi, Shay; Paradkar, Amit
Web script crashes and malformed dynamically-generated web pages are common errors, and they seriously impact the usability of web applications. Current tools for web-page validation cannot handle the dynamically generated pages that are ubiquitous on today's Internet. We present a dynamic test generation technique for the domain of dynamic web applications. The technique utilizes both combined concrete and symbolic execution and explicit-state model checking. The technique generates tests automatically, runs the tests capturing logical constraints on inputs, and minimizes the conditions on the inputs to failing tests, so that the resulting bug reports are small and useful in finding and fixing the underlying faults. Our tool Apollo implements the technique for the PHP programming language. Apollo generates test inputs for a web application, monitors the application for crashes, and validates that the output conforms to the HTML specification. This paper presents Apollo's algorithms and implementation, and an experimental evaluation that revealed 302 faults in 6 PHP web applications.
2009年3月26日 00:00:00 GMThttps://hdl.handle.net/1721.1/449562009年03月26日T00:00:00ZThe Abstract MAC Layer
https://hdl.handle.net/1721.1/44620
The Abstract MAC Layer
Newport, Calvin; Lynch, Nancy; Kuhn, Fabian
A diversity of possible communication assumptions complicates the study of algorithms and lower bounds for radio networks. We address this problem by defining an Abstract MAC Layer. This service provides reliable local broadcast communication, with timing guarantees stated in terms of a collection of abstract delay functions applied to the relevant contention. Algorithm designers can analyze their algorithms in terms of these functions, independently of specific channel behavior. Concrete implementations of the Abstract MAC Layer over basic radio network models generate concrete definitions for these delay functions, automatically adapting bounds proven for the abstract service to bounds for the specific radio network under consideration. To illustrate this approach, we use the Abstract MAC Layer to study the new problem of Multi-Message Broadcast, a generalization of standard single-message broadcast, in which any number of messages arrive at any processes at any times. We present and analyze two algorithms for Multi-Message Broadcast in static networks: a simple greedy algorithm and one that uses regional leaders. We indicate how these results can be extended to mobile networks.
2009年2月21日 00:00:00 GMThttps://hdl.handle.net/1721.1/446202009年02月21日T00:00:00ZOvercoming the Antennas-Per-Node Throughput Limit in MIMO LANs
https://hdl.handle.net/1721.1/44616
Overcoming the Antennas-Per-Node Throughput Limit in MIMO LANs
Perli, Samuel David; Gollakota, Shyamnath; Katabi, Dina
Today, the number of concurrent packets in a MIMO LAN is limited by the number of antennas on the AP. This paper shows how to overcome this limit. It presents a new design where multiple client-AP pairs can communicate concurrently, on the same 802.11 channel. We demonstrate both analytically and experimentally that our design almost doubles the throughput of a MIMO LAN. The key idea underlying our approach is Interference Alignment and Cancellation (IAC), a novel technique for decoding concurrent sender-receiver pairs in MIMO LANs. It exploits two basic properties of MIMO LANs. First, MIMO transmitters can control the alignment of their signals at a receiver. Second, APs are typically connected to a backend Ethernet, which they can use for coordination. Hence, in IAC, transmitters align their signals such that the first AP can decode at least one of the concurrent packets. Once a packet is decoded, it is sent over the Ethernet to the second AP, which subtracts it from its received signal to decode a second packet, which it sends to the third AP to decode the next packet, and so on. We implement our technique in 2x2 MIMO GNU Radios, and demonstrate via wireless experiments that IAC increases the average throughput of a MIMO LAN by 1.5x on the downlink and 2x on the uplink.
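The decoding chain can be illustrated numerically. In the noiseless numpy toy below (hypothetical random channels; not the GNU Radio implementation), three two-antenna clients each send one packet; clients 2 and 3 precode so their signals align at AP1, AP1 decodes packet 1 along the remaining direction, ships it over the wired backend, and AP2 cancels it to decode packets 2 and 3.

    import numpy as np

    rng = np.random.default_rng(0)
    H = {(j, i): rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
         for j in (1, 2) for i in (1, 2, 3)}       # channel: AP j <- client i
    s = np.array([1 + 1j, -1 + 1j, 1 - 1j])         # one symbol per packet

    # Precoding: align clients 2 and 3 at AP1, i.e. H[1,2] v2 == H[1,3] v3.
    v1 = rng.normal(size=2) + 1j * rng.normal(size=2)
    v3 = rng.normal(size=2) + 1j * rng.normal(size=2)
    v2 = np.linalg.solve(H[1, 2], H[1, 3] @ v3)

    y1 = H[1, 1] @ v1 * s[0] + H[1, 2] @ v2 * s[1] + H[1, 3] @ v3 * s[2]
    y2 = H[2, 1] @ v1 * s[0] + H[2, 2] @ v2 * s[1] + H[2, 3] @ v3 * s[2]

    # AP1 sees only two directions: packet 1, and the aligned sum s2 + s3.
    A1 = np.column_stack([H[1, 1] @ v1, H[1, 3] @ v3])
    s1_hat, _ = np.linalg.solve(A1, y1)

    # AP1 ships s1_hat over the Ethernet; AP2 cancels it and decodes 2 and 3.
    y2_clean = y2 - H[2, 1] @ v1 * s1_hat
    A2 = np.column_stack([H[2, 2] @ v2, H[2, 3] @ v3])
    s2_hat, s3_hat = np.linalg.solve(A2, y2_clean)

    assert np.allclose([s1_hat, s2_hat, s3_hat], s)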
2009年2月18日 00:00:00 GMThttps://hdl.handle.net/1721.1/446162009年02月18日T00:00:00ZAutomatic Class-Specific 3D Reconstruction from a Single Image
https://hdl.handle.net/1721.1/44615
Automatic Class-Specific 3D Reconstruction from a Single Image
Lozano-Perez, Tomas; Kaelbling, Leslie Pack; Chiu, Han-Pang
Our goal is to automatically reconstruct 3D objects from a single image, by using prior 3D shape models of classes. The shape models, defined as a collection of oriented primitive shapes centered at fixed 3D positions, can be learned from a few labeled images for each class. The 3D class model can then be used to estimate the 3D shape of an object instance, including occluded parts, from a single image. We provide a quantitative evaluation of the shape estimation process on real objects and demonstrate its usefulness in three applications: robot manipulation, object detection, and generating 3D 'pop-up' models from photos.
2009年2月18日 00:00:00 GMThttps://hdl.handle.net/1721.1/446152009年02月18日T00:00:00ZA Tour of MOOS-IvP Autonomy Software Modules
https://hdl.handle.net/1721.1/44590
A Tour of MOOS-IvP Autonomy Software Modules
Benjamin, Michael R.; Leonard, John J.; Schmidt, Henrik; Newman, Paul M.
This paper provides an overview of the MOOS-IvP autonomy software modules. The MOOS-IvP collection of software (i.e., codebase) described here has been developed and is currently maintained by three organizations: Oxford University, the Massachusetts Institute of Technology (MIT), and the Naval Undersea Warfare Center (NUWC) Division Newport, Rhode Island. The objective of this paper is to provide a comprehensive list of modules and, for each, (a) a general description of functionality, (b) dependency relationships to other modules, (c) a rough order of magnitude in complexity or size, (d) authorship, and (e) current and planned distribution access.
2009年2月13日 00:00:00 GMThttps://hdl.handle.net/1721.1/445902009年02月13日T00:00:00ZSoftCast: One Video to Serve All Wireless Receivers
https://hdl.handle.net/1721.1/44585
SoftCast: One Video to Serve All Wireless Receivers
Katabi, Dina; Rahul, Hariharan; Jakubczak, Szymon
The main challenge in wireless video multicast is to scalably serve multiple receivers who have different channel characteristics. Current wireless transmission schemes, however, cannot support smooth degradation. Specifically, each packet is transmitted at a particular bitrate and is decodable only by receivers that support the chosen bitrate. Broadcasting a video stream to all receivers requires transmitting at the lowest bitrate, and hence reduces everyone to the performance of the worst receiver in the multicast group. This paper introduces SoftCast, an alternative design for wireless video multicast, in which a sender broadcasts a single stream and each receiver watches a video quality that matches its channel quality. SoftCast achieves this by making the magnitude of the transmitted signal proportional to the pixel value. Hence, channel noise directly translates to a small perturbation in pixel values, allowing graceful degradation with increasing noise. SoftCast introduces a novel power allocation scheme that allows the transmission of real-valued video signals in a compact and resilient manner. We implement SoftCast in the WARP radio platform. Our results show that SoftCast improves the average video quality across multicast receivers by 3-7dB over the current approach. Further, it stays competitive with the current approach even for regular unicast.
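The core idea fits in a short numpy sketch (hypothetical power budget and SNRs; not the WARP implementation): pixels are sent directly as scaled real-valued magnitudes, so each receiver's error is just its own channel noise and quality degrades smoothly with SNR.

    import numpy as np

    rng = np.random.default_rng(1)
    pixels = rng.integers(0, 256, size=10_000).astype(float)

    # Transmit the zero-mean pixel values directly as signal magnitudes,
    # scaled to meet a (hypothetical) average power budget P = 100.
    mean = pixels.mean()
    x = pixels - mean
    g = np.sqrt(100.0 / np.mean(x ** 2))
    tx = g * x

    for snr_db in (5, 15, 25):                # three receivers, one stream
        noise_var = 100.0 / 10 ** (snr_db / 10)
        rx = tx + rng.normal(scale=np.sqrt(noise_var), size=tx.shape)
        est = np.clip(rx / g + mean, 0, 255)  # linear decode per sample
        mse = np.mean((est - pixels) ** 2)
        psnr = 10 * np.log10(255 ** 2 / mse)
        print(f"receiver at {snr_db:2d} dB SNR -> PSNR {psnr:.1f} dB")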
2009年2月07日 00:00:00 GMThttps://hdl.handle.net/1721.1/445852009年02月07日T00:00:00ZHAMPI: A Solver for String Constraints
https://hdl.handle.net/1721.1/44584
HAMPI: A Solver for String Constraints
Ernst, Michael D.; Kiezun, Adam; Ganesh, Vijay; Guo, Philip J.; Hooimeijer, Pieter
Many automatic testing, analysis, and verification techniques for programs can be effectively reduced to a constraint-generation phase followed by a constraint-solving phase. This separation of concerns often leads to more effective and maintainable tools. The increasing efficiency of off-the-shelf constraint solvers makes this approach even more compelling. However, there are few, if any, effective and sufficiently expressive off-the-shelf solvers for string constraints generated by analysis techniques for string-manipulating programs. We designed and implemented Hampi, a solver for string constraints over bounded string variables. Hampi constraints express membership in regular languages and bounded context-free languages. Hampi constraints may contain context-free-language definitions, regular-language definitions and operations, and the membership predicate. Given a set of constraints, Hampi outputs a string that satisfies all the constraints, or reports that the constraints are unsatisfiable. Hampi is expressive and efficient, and can be successfully applied to testing and analysis of real programs. Our experiments use Hampi in static and dynamic analyses for finding SQL injection vulnerabilities in Web applications and in automated bug finding in C programs using systematic testing, and they compare Hampi with another string solver. Hampi's source code, documentation, and the experimental data are available at http://people.csail.mit.edu/akiezun/hampi.
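For intuition about the problem Hampi solves (though not its algorithm), the naive Python sketch below finds a bounded string that both matches a regular expression and satisfies a substring constraint, by brute-force enumeration over a tiny alphabet.

    import itertools
    import re

    def solve(alphabet, max_len, regex, must_contain):
        """Find a string of length <= max_len matching `regex` and containing
        `must_contain`, or report unsatisfiable. Naive enumeration for
        illustration; Hampi itself solves bounded regular and context-free
        constraints directly."""
        pattern = re.compile(regex)
        for n in range(max_len + 1):
            for chars in itertools.product(alphabet, repeat=n):
                s = "".join(chars)
                if pattern.fullmatch(s) and must_contain in s:
                    return s
        return None

    # E.g., a string of a's and b's matching (ab)* that contains "ba".
    print(solve("ab", 6, r"(ab)*", "ba"))   # -> "abab"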
2009年2月04日 00:00:00 GMThttps://hdl.handle.net/1721.1/445842009年02月04日T00:00:00ZSelf-Stabilizing Message Routing in Mobile ad hoc Networks
https://hdl.handle.net/1721.1/44516
Self-Stabilizing Message Routing in Mobile ad hoc Networks
Lynch, Nancy; Lahiani, Limor; Dolev, Shlomi; Nolte, Tina
We present a self-stabilizing algorithm for routing messages between arbitrary pairs of nodes in a mobile ad hoc network. Our algorithm assumes the availability of a reliable GPS service, which supplies mobile nodes with accurate information about real time and about their own geographical locations. The GPS service provides an external, shared source of consistency for mobile nodes, allowing them to label and timestamp messages, and thereby aiding in recovery from failures. Our algorithm utilizes a Virtual Infrastructure programming abstraction layer, consisting of mobile client nodes, virtual stationary timed machines called Virtual Stationary Automata (VSAs), and a local broadcast service connecting VSAs and mobile clients. VSAs are associated with predetermined regions in the plane, and are emulated in a self-stabilizing manner by the mobile nodes. VSAs are relatively stable in the face of node mobility and failure, and can be used to simplify algorithm development for mobile networks. Our routing algorithm consists of three subalgorithms: (1) a VSA-to-VSA geographical routing algorithm, (2) a mobile client location management algorithm, and (3) the main algorithm, which utilizes both location management and geographical routing. All three subalgorithms are self-stabilizing, and consequently, the entire algorithm is also self-stabilizing.
2009年1月28日 00:00:00 GMThttps://hdl.handle.net/1721.1/445162009年01月28日T00:00:00ZThe Art of the Propagator
https://hdl.handle.net/1721.1/44215
The Art of the Propagator
Sussman, Gerald Jay; Radul, Alexey
We develop a programming model built on the idea that the basic computational elements are autonomous machines interconnected by shared cells through which they communicate. Each machine continuously examines the cells it is interested in, and adds information to some based on deductions it can make from information from the others. This model makes it easy to smoothly combine expression-oriented and constraint-based programming; it also easily accommodates implicit incremental distributed search in ordinary programs. This work builds on the original research of Guy Lewis Steele Jr. and was developed more recently with the help of Chris Hanson.
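A minimal Python rendering of the model (a hypothetical API, far simpler than the Scheme system the paper describes): cells hold values, propagators watch their input cells, and any newly added information is pushed through the network until it quiesces. The example wires a bidirectional temperature converter out of two one-way propagators.

    class Cell:
        def __init__(self, name):
            self.name, self.content, self.watchers = name, None, []
        def add(self, value):
            if value is None or value == self.content:
                return                      # nothing new: stay quiescent
            if self.content is not None:
                raise ValueError(f"contradiction in {self.name}")
            self.content = value
            for run in self.watchers:       # wake the machines watching us
                run()

    def propagator(inputs, output, fn):
        def run():
            vals = [c.content for c in inputs]
            if all(v is not None for v in vals):
                output.add(fn(*vals))
        for c in inputs:
            c.watchers.append(run)
        run()

    # Bidirectional converter built from two one-way propagators.
    c, f = Cell("celsius"), Cell("fahrenheit")
    propagator([c], f, lambda v: v * 9 / 5 + 32)
    propagator([f], c, lambda v: (v - 32) * 5 / 9)

    f.add(212.0)
    print(c.content)   # -> 100.0, deduced by the reverse propagator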
2009年1月26日 00:00:00 GMThttps://hdl.handle.net/1721.1/442152009年01月26日T00:00:00ZOrganic Indoor Location Discovery
https://hdl.handle.net/1721.1/43951
Organic Indoor Location Discovery
Hicks, Jamey; Curtis, Dorothy; Teller, Seth; Charrow, Ben; Ryan, Russell; Ledlie, Jonathan; Battat, Jonathan
We describe an indoor, room-level location discovery method based on spatial variations in "wifi signatures," i.e., MAC addresses and signal strengths of existing wireless access points. The principal novelty of our system is its organic nature; it builds signal strength maps from the natural mobility and lightweight contributions of ordinary users, rather than dedicated effort by a team of site surveyors. Whenever a user's personal device observes an unrecognized signature, a GUI solicits the user's location. The resulting location-tagged signature or "bind" is then shared with other clients through a common database, enabling devices subsequently arriving there to discover location with no further user contribution.
Realizing a working system deployment required three novel elements: (1) a human-computer interface for indicating location over intervals of varying duration; (2) a client-server protocol for pre-fetching signature data for use in localization; and (3) a location-estimation algorithm incorporating highly variable signature data. We describe an experimental deployment of our method in a nine-story building with more than 1,400 distinct spaces served by more than 200 wireless access points. At the conclusion of the deployment, users could correctly localize to within 10 meters 92 percent of the time.
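As a sketch of the localization step only (hypothetical binds, rooms, and similarity measure; not the deployed algorithm), a client can score the observed signature against each user-contributed bind and report the best-matching space.

    # Hypothetical room-level localization from wifi signatures: each "bind"
    # pairs a user-reported space with the access points (and RSSI) seen there.

    binds = {
        "32-G982": {"aa:01": -42, "aa:02": -60, "aa:07": -71},
        "32-G914": {"aa:02": -48, "aa:03": -55},
        "32-G825": {"aa:03": -44, "aa:05": -58, "aa:07": -80},
    }

    def similarity(observed, stored, missing=-95):
        """Negative mean squared RSSI difference over the union of APs;
        unseen APs are treated as very weak (missing dBm)."""
        aps = set(observed) | set(stored)
        diffs = [(observed.get(ap, missing) - stored.get(ap, missing)) ** 2
                 for ap in aps]
        return -sum(diffs) / len(aps)

    def localize(observed):
        return max(binds, key=lambda space: similarity(observed, binds[space]))

    print(localize({"aa:02": -50, "aa:03": -57}))   # -> "32-G914"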
2008年12月30日 00:00:00 GMThttps://hdl.handle.net/1721.1/439512008年12月30日T00:00:00ZResilient Auctions of One Good in Limited Supply
https://hdl.handle.net/1721.1/43947
Resilient Auctions of One Good in Limited Supply
Micali, Silvio; Chen, Jing
We present various resilient auction mechanisms for a good in limited supply. Our mechanisms achieve both player-knowledge and aggregated player-knowledge benchmarks.
2008年12月17日 00:00:00 GMThttps://hdl.handle.net/1721.1/439472008年12月17日T00:00:00ZResilient Provision of a Public and/or Private Good, or: Resilient Auctions of One Good in Unlimited Supply
https://hdl.handle.net/1721.1/43946
Resilient Provision of a Public and/or Private Good, or: Resilient Auctions of One Good in Unlimited Supply
Chen, Jing; Micali, Silvio
We present two resilient mechanisms: the first for the provision of a public good, and the second for the provision of a private good. Both mechanisms adopt a knowledge-based benchmark.
2008年12月02日 00:00:00 GMThttps://hdl.handle.net/1721.1/439462008年12月02日T00:00:00ZResilient Provision of a Public Good
https://hdl.handle.net/1721.1/43716
Resilient Provision of a Public Good
Micali, Silvio; Chen, Jing
We present two resilient mechanisms for the provision of a public good. Both mechanisms adopt a knowledge-based benchmark.
2008年12月02日 00:00:00 GMThttps://hdl.handle.net/1721.1/437162008年12月02日T00:00:00ZResilient Knowledge-Based Mechanisms For Truly Combinatorial Auctions (And Implementation in Surviving Strategies)
https://hdl.handle.net/1721.1/43715
Resilient Knowledge-Based Mechanisms For Truly Combinatorial Auctions (And Implementation in Surviving Strategies)
Micali, Silvio; Chen, Jing
We put forward a new mechanism achieving a high benchmark for (both revenue and) the sum of revenue and efficiency in truly combinatorial auctions. Notably, our mechanism guarantees its performance (1) in a very adversarial collusion model; (2) for any profile of strategies surviving the iterated elimination of dominated strategies; and (3) by leveraging the knowledge that the players have about each other (in a non-Bayesian setting). Our mechanism is also computationally efficient, and preserves the players' privacy to an unusual extent.
2008年10月08日 00:00:00 GMThttps://hdl.handle.net/1721.1/437152008年10月08日T00:00:00ZMathematics of the Neural Response
https://hdl.handle.net/1721.1/43713
Mathematics of the Neural Response
Caponnetto, Andrea; Poggio, Tomaso; Bouvrie, Jake; Rosasco, Lorenzo; Smale, Steve
We propose a natural image representation, the neural response, motivated by the neuroscience of the visual cortex. The inner product defined by the neural response leads to a similarity measure between functions which we call the derived kernel. Based on a hierarchical architecture, we give a recursive definition of the neural response and associated derived kernel. The derived kernel can be used in a variety of application domains such as classification of images, strings of text and genomics data.
2008年11月26日 00:00:00 GMThttps://hdl.handle.net/1721.1/437132008年11月26日T00:00:00ZStochastic Digital Circuits for Probabilistic Inference
https://hdl.handle.net/1721.1/43712
Stochastic Digital Circuits for Probabilistic Inference
Tenenbaum, Joshua B.; Jonas, Eric M.; Mansinghka, Vikash K.
We introduce combinational stochastic logic, an abstraction that generalizes deterministic digital circuit design (based on Boolean logic gates) to the probabilistic setting. We show how this logic can be combined with techniques from contemporary digital design to generate stateless and stateful circuits for exact and approximate sampling from a range of probability distributions. We focus on Markov chain Monte Carlo algorithms for Markov random fields, using massively parallel circuits. We implement these circuits on commodity reconfigurable logic and estimate the resulting performance in time, space and price. Using our approach, these simple and general algorithms could be affordably run for thousands of iterations on models with hundreds of thousands of variables in real time.
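The basic abstraction can be sketched in a few lines of Python (illustrative, not the FPGA circuits): a deterministic comparator fed a stream of fair random bits yields an exact Bernoulli(p) sample, and such samplers compose like ordinary logic gates.

    import random

    def bernoulli(p, bits=32):
        """Exact Bernoulli(p) from fair coin flips: compare a uniform binary
        fraction, drawn one random bit at a time, against the binary
        expansion of p. Deterministic comparator logic over random bits."""
        frac = p
        for _ in range(bits):
            frac *= 2
            p_bit = 1 if frac >= 1 else 0
            frac -= p_bit
            r_bit = random.getrandbits(1)   # the entropy source
            if r_bit != p_bit:
                return 1 if r_bit < p_bit else 0
        return 0

    # Units compose like gates: AND of independent samplers is Bernoulli(p*q).
    trials = 100_000
    hits = sum(bernoulli(0.7) & bernoulli(0.4) for _ in range(trials))
    print(hits / trials)                    # close to 0.28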
2008年11月24日 00:00:00 GMThttps://hdl.handle.net/1721.1/437122008年11月24日T00:00:00ZModeling Computational Security in Long-Lived Systems, Version 2
https://hdl.handle.net/1721.1/43711
Modeling Computational Security in Long-Lived Systems, Version 2
Lynch, Nancy; Pereira, Olivier; Kaynar, Dilsun; Cheung, Ling; Canetti, Ran
For many cryptographic protocols, security relies on the assumption that adversarial entities have limited computational power. This type of security degrades progressively over the lifetime of a protocol. However, some cryptographic services, such as timestamping services or digital archives, are long-lived in nature; they are expected to be secure and operational for a very long time (i.e., super-polynomial). In such cases, security cannot be guaranteed in the traditional sense: a computationally secure protocol may become insecure if the attacker has a super-polynomial number of interactions with the protocol. This paper proposes a new paradigm for the analysis of long-lived security protocols. We allow entities to be active for a potentially unbounded amount of real time, provided they perform only a polynomial amount of work per unit of real time. Moreover, the space used by these entities is allocated dynamically and must be polynomially bounded. We propose a new notion of long-term implementation, which is an adaptation of computational indistinguishability to the long-lived setting. We show that long-term implementation is preserved under polynomial parallel composition and exponential sequential composition. We illustrate the use of this new paradigm by analyzing some security properties of the long-lived timestamping protocol of Haber and Kamat.
2008年11月22日 00:00:00 GMThttps://hdl.handle.net/1721.1/437112008年11月22日T00:00:00ZResilient Mechanisms For Truly Combinatorial Auctions
https://hdl.handle.net/1721.1/43709
Resilient Mechanisms For Truly Combinatorial Auctions
Micali, Silvio; Valiant, Paul
Dominant-strategy truthfulness is traditionally considered the best possible solution concept in mechanism design, as it enables one to predict with confidence which strategies INDEPENDENT players will actually choose. Yet, as with any other form of equilibrium, it too can be extremely vulnerable to COLLUSION. The problem of collusion is particularly evident for UNRESTRICTED combinatorial auctions, arguably the hardest type of auctions. We thus investigate how much revenue can be guaranteed, in unrestricted combinatorial auctions, by dominant-strategy-truthful mechanisms that are COLLUSION-RESILIENT in a very strong sense, and obtain almost matching upper and lower bounds.
2008年11月13日 00:00:00 GMThttps://hdl.handle.net/1721.1/437092008年11月13日T00:00:00ZMOOS-IvP Autonomy Tools Users Manual
https://hdl.handle.net/1721.1/43708
MOOS-IvP Autonomy Tools Users Manual
Benjamin, Michael R.
This document describes seven common MOOS-IvP autonomy tools. The uHelmScope application provides a run-time scoping window into the state of an active IvP Helm executing its mission. The pMarineViewer application is a geo-based GUI tool for rendering marine vehicles and certain autonomy properties in their operational area. The uXMS application is a terminal-based tool for live scoping on a MOOSDB process. The uTermCommand application is a terminal-based tool for poking the MOOSDB with a set of variable-value pairs pre-defined in a MOOS file, selectable from the command line with tab-completion of aliases. The pEchoVar application provides a way of echoing an observed write to a variable with a new write of the same value to a different variable name. The uProcessWatch application is a way of monitoring the presence or absence of a set of MOOS processes and summarizing the collective status in a single MOOS variable. The uPokeDB application is a way of poking a MOOSDB from the command line with one or more variable-value pairs, without any pre-existing configuration of a MOOS file.
2008年11月11日 00:00:00 GMThttps://hdl.handle.net/1721.1/437082008年11月11日T00:00:00ZEnergy Scalability of On-Chip Interconnection Networks in Multicore Architectures
https://hdl.handle.net/1721.1/43707
Energy Scalability of On-Chip Interconnection Networks in Multicore Architectures
Agarwal, Anant; Psota, James; Eastep, Jonathan; Konstantakopoulos, Theodoros
On-chip interconnection networks (OCNs) such as point-to-point networks and buses form the communication backbone in systems-on-a-chip, multicore processors, and tiled processors. OCNs can consume significant portions of a chip's energy budget, so analyzing their energy consumption early in the design cycle becomes important for architectural design decisions. Although numerous studies have examined OCN implementation and performance, few have examined energy. This paper develops an analytical framework for energy estimation in OCNs and presents results based on both analytical models of communication patterns and real network traces from applications running on a tiled multicore processor. Our analytical framework supports arbitrary OCN topologies under arbitrary communication patterns while accounting for wire length, switch energy, and network contention. It is the first to incorporate the effects of communication locality and network contention, and use real traces extensively. This paper compares the energy of point-to-point networks against buses under varying degrees of communication locality. The results indicate that, for 16 or more processors, a one-dimensional and a two-dimensional point-to-point network provide 66% and 82% energy savings, respectively, over a bus assuming that processors communicate with equal likelihood. The energy savings increase for patterns which exhibit locality. For the two-dimensional point-to-point OCN of the Raw tiled microprocessor, contention contributes a maximum of just 23% of the OCN energy, using estimated values for channel, switch control logic, and switch queue buffer energy of 34.5pJ, 17pJ, and 12pJ, respectively. Our results show that the energy-delay product per message decreases with increasing processor message injection rate.
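To make the framework's flavor concrete, the sketch below estimates per-message energy from the component figures quoted above (a hypothetical back-of-the-envelope Python model, not the paper's actual equations; the function name and the contention term are invented for illustration).

    def message_energy_pj(hops, flits, e_channel=34.5, e_ctrl=17.0, e_buf=12.0,
                          contended_hops=0):
        """Rough per-message energy (pJ) for a point-to-point OCN.

        Hypothetical model: every flit pays channel, switch-control, and
        buffer energy at each hop; contention adds one extra buffering event
        per contended hop. Defaults use the paper's estimates for channel
        (34.5 pJ), switch control logic (17 pJ), and queue buffer (12 pJ)."""
        per_hop = e_channel + e_ctrl + e_buf
        return flits * (hops * per_hop + contended_hops * e_buf)

    # A 4-hop, 8-flit message on an idle mesh vs. one queued at 2 hops:
    print(message_energy_pj(hops=4, flits=8))                    # 2032.0 pJ
    print(message_energy_pj(hops=4, flits=8, contended_hops=2))  # 2224.0 pJ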
2008年11月11日 00:00:00 GMThttps://hdl.handle.net/1721.1/437072008年11月11日T00:00:00ZRecursively invoking Linnaeus: A Taxonomy for Naming Systems
https://hdl.handle.net/1721.1/42898
Recursively invoking Linnaeus: A Taxonomy for Naming Systems
Sollins, Karen R.
Naming is a central element of distributed and network system design, and appropriate design choices are therefore essential. This paper explores a taxonomy of naming systems, and the engineering tradeoffs, as an aid to the namespace designer. The three orthogonal components of the taxonomy are the characteristics of the namespace itself, name assignment, and name resolution. Within each of these, we explore a number of distinct characteristics. The position of this paper is that the engineering design of naming systems should be informed by the possibilities and the tradeoffs that those possibilities represent. The paper includes a review of a sampling of naming system designs that reflect different choices within the taxonomy, and discussion of why those choices were made.
2002年3月01日 00:00:00 GMThttps://hdl.handle.net/1721.1/428982002年03月01日T00:00:00ZOne Video Stream to Serve Diverse Receivers
https://hdl.handle.net/1721.1/42897
One Video Stream to Serve Diverse Receivers
Woo, Grace; Katabi, Dina; Chachulski, Szymon
The fundamental problem of wireless video multicast is to scalably serve multiple receivers which may have very different channel characteristics. Ideally, one would like to broadcast a single stream that allows each receiver to benefit from all correctly received bits to improve its video quality. We introduce Digital Rain, a new approach to wireless video multicast that adapts to channel characteristics without any need for receiver feedback or variable codec rates. Users that capture more packets or have fewer bit errors naturally see higher video quality. Digital Rain departs from current approaches in two ways: 1) It allows a receiver to exploit video packets that may contain bit errors; 2) It builds on the theory of compressed sensing to develop robust video encoding and decoding algorithms that degrade smoothly with bit errors and packet loss. Implementation results from an indoor wireless testbed show that Digital Rain significantly improves the received video quality and the number of supported receivers.
2008年10月18日 00:00:00 GMThttps://hdl.handle.net/1721.1/428972008年10月18日T00:00:00ZAdaptive Kernel Methods Using the Balancing Principle
https://hdl.handle.net/1721.1/42896
Adaptive Kernel Methods Using the Balancing Principle
Rosasco, Lorenzo; Pereverzyev, Sergei; De Vito, Ernesto
The regularization parameter choice is a fundamental problem in supervised learning, since the performance of most algorithms crucially depends on the choice of one or more such parameters. In particular, a main theoretical issue regards the amount of prior knowledge about the problem needed to suitably choose the regularization parameter and obtain learning rates. In this paper we present a strategy, the balancing principle, for choosing the regularization parameter without knowledge of the regularity of the target function. Such a choice adaptively achieves the best error rate. Our main result applies to regularization algorithms in reproducing kernel Hilbert spaces with the square loss, though we also study how a similar principle can be used in other situations. As a straightforward corollary we can immediately derive adaptive parameter choices for various kernel methods recently studied. Numerical experiments with the proposed parameter choice rules are also presented.
2008年10月16日 00:00:00 GMThttps://hdl.handle.net/1721.1/428962008年10月16日T00:00:00ZModular Generation and Customization
https://hdl.handle.net/1721.1/42895
Modular Generation and Customization
Edwards, Jonathan
Modularity and flexibility can conflict in multi-language systems. For example, the templates commonly used to generate web pages must be manually updated when the database schema changes. Modularity can be improved by generating web pages automatically from the database schema, but it is hard for such a generator to produce the same variety of outputs that are easily achieved by ad hoc edits to a template. Ideally, such ad hoc edits would be abstracted into transformations that compose with the generator, offering both modularity and flexibility. However common customizations cannot be abstracted using the standard techniques of textual identifiers and ordinal positions. These difficulties are distilled into a challenge problem to evaluate potential solutions. A solution is proposed based on field trees, a new data model for software artifacts that provides persistent identifiers and unshifting positions within sequences. But using field trees with conventional programming languages and development environments requires more effort than the ad hoc editing they seek to supplant. Field trees are therefore extended into differential trees, which integrate artifacts and their transformations into a unified representation.
2008年10月10日 00:00:00 GMThttps://hdl.handle.net/1721.1/428952008年10月10日T00:00:00ZThe Case for a Factored Operating System (fos)
https://hdl.handle.net/1721.1/42894
The Case for a Factored Operating System (fos)
Agarwal, Anant; Wentzlaff, David
The next decade will afford us computer chips with 1,000 - 10,000 cores on a single piece of silicon. Contemporary operating systems have been designed to operate on a single core or a small number of cores and hence are not well suited to manage and provide operating system services at such large scale. Managing 10,000 cores is so fundamentally different from managing two cores that the traditional evolutionary approach of operating system optimization will cease to work. The fundamental design of operating systems and operating system data structures must be rethought. This work begins by documenting the scalability problems of contemporary operating systems. These studies are used to motivate the design of a factored operating system (fos). fos is a new operating system targeting 1000+ core multicore systems where space sharing replaces traditional time sharing to increase scalability. fos is built as a collection of Internet-inspired services. Each operating system service is factored into a fleet of communicating servers which in aggregate implement a system service. These servers are designed much in the way that distributed Internet services are designed, but instead of providing high-level Internet services, these servers provide traditional kernel services and manage traditional kernel data structures in a factored, spatially distributed manner. The servers are bound to distinct processing cores and by doing so do not fight with end-user applications for implicit resources such as TLBs and caches. Also, spatial distribution of these OS services facilitates locality, as many operations only need to communicate with the nearest server for a given service.
2008年10月08日 00:00:00 GMThttps://hdl.handle.net/1721.1/428942008年10月08日T00:00:00ZNew Resiliency in Truly Combinatorial Auctions (and Implementation in Surviving Strategies)
https://hdl.handle.net/1721.1/42893
New Resiliency in Truly Combinatorial Auctions (and Implementation in Surviving Strategies)
Chen, Jing; Micali, Silvio
Following Micali and Valiant [MV07.a], a mechanism is resilient if it achieves its objective without any problem of (1) equilibrium selection and (2) player collusion. To advance resilient mechanism design: We put forward a new meaningful benchmark for the COMBINED social welfare-revenue performance of any mechanism in truly combinatorial auctions. We put forward a NEW notion of implementation, much more general than the ones used so far, which we believe to be of independent interest. We put forward a new RESILIENT mechanism that, by leveraging the knowledge that the players have about each other, guarantees at least one half of our benchmark under a very general collusion model.
2008年10月08日 00:00:00 GMThttps://hdl.handle.net/1721.1/428932008年10月08日T00:00:00ZZigZag Decoding: Combating Hidden Terminals In Wireless Networks
https://hdl.handle.net/1721.1/42842
ZigZag Decoding: Combating Hidden Terminals In Wireless Networks
Katabi, Dina; Gollakota, Shyamnath
This paper presents ZigZag, an 802.11 receiver design that combats hidden terminals. ZigZag's core contribution is a new form of interference cancellation that exploits asynchrony across successive collisions. Specifically, 802.11 retransmissions, in the case of hidden terminals, cause successive collisions. These collisions have different interference-free stretches at their start, which ZigZag exploits to bootstrap its decoding. ZigZag makes no changes to the 802.11 MAC and introduces no overhead when there are no collisions. But, when senders collide, ZigZag attains the same throughput as if the colliding packets were a priori scheduled in separate time slots. We build a prototype of ZigZag in GNU Radio. In a testbed of 14 USRP nodes, ZigZag reduces the average packet loss rate at hidden terminals from 72.6% to about 0.7%.
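The peeling process can be demonstrated on symbol vectors: given two collisions of the same packets with different offsets, the interference-free stretch bootstraps a chain of subtractions that alternates between the two collisions (toy noiseless Python over integer symbols; hypothetical, not the GNU Radio decoder).

    import numpy as np

    rng = np.random.default_rng(2)
    n = 20
    A = rng.integers(1, 10, n)      # packet from sender 1, as symbol values
    B = rng.integers(1, 10, n)      # packet from the hidden terminal

    def collide(a, b, d):
        """Superpose packet b on packet a with a delay of d symbols."""
        y = np.zeros(len(a) + d, dtype=int)
        y[:len(a)] += a
        y[d:] += b
        return y

    d1, d2 = 3, 7                   # the two collisions start with different gaps
    y1, y2 = collide(A, B, d1), collide(A, B, d2)

    # Bootstrap: the first d2 symbols of y2 are interference-free, exposing a
    # chunk of A; subtracting known A symbols from y1 exposes a chunk of B,
    # and so on, zigzagging between the two collisions.
    A_hat, B_hat = np.zeros(n, int), np.zeros(n, int)
    A_hat[:d2] = y2[:d2]
    known_a, known_b = d2, 0
    while known_b < n:
        limit_b = n if known_a == n else known_a - d1
        for k in range(known_b, limit_b):      # peel B out of collision 1
            B_hat[k] = y1[k + d1] - (A_hat[k + d1] if k + d1 < n else 0)
        known_b = limit_b
        limit_a = min(known_b + d2, n)
        for t in range(known_a, limit_a):      # peel A out of collision 2
            A_hat[t] = y2[t] - B_hat[t - d2]
        known_a = limit_a

    assert (A_hat == A).all() and (B_hat == B).all()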
2008年10月01日 00:00:00 GMThttps://hdl.handle.net/1721.1/428422008年10月01日T00:00:00ZRefactoring Sequential Java Code for Concurrency via Concurrent Libraries
https://hdl.handle.net/1721.1/42841
Refactoring Sequential Java Code for Concurrency via Concurrent Libraries
Ernst, Michael D.; Marrero, John; Dig, Danny
Parallelizing existing sequential programs to run efficiently on multicores is hard. The Java 5 package java.util.concurrent (j.u.c.) supports writing concurrent programs: much of the complexity of writing thread-safe and scalable programs is hidden in the library. To use this package, programmers still need to reengineer existing code. This is tedious because it requires changing many lines of code, is error-prone because programmers can use the wrong APIs, and is omission-prone because programmers can miss opportunities to use the enhanced APIs. This paper presents our tool, CONCURRENCER, which enables programmers to refactor sequential code into parallel code that uses j.u.c. concurrent utilities. CONCURRENCER does not require any program annotations, although the transformations are very involved: they span multiple program statements and use custom program analysis. A find-and-replace tool cannot perform such transformations. Empirical evaluation shows that CONCURRENCER refactors code effectively: CONCURRENCER correctly identifies and applies transformations that some open-source developers overlooked, and the converted code exhibits good speedup.
2008年9月30日 00:00:00 GMThttps://hdl.handle.net/1721.1/428412008年09月30日T00:00:00ZRank Priors for Continuous Non-Linear Dimensionality Reduction
https://hdl.handle.net/1721.1/42840
Rank Priors for Continuous Non-Linear Dimensionality Reduction
Stiefelhagen, Rainer; Darrell, Trevor; Urtasun, Raquel; Geiger, Andreas
Non-linear dimensionality reduction methods are powerful techniques to deal with high-dimensional datasets. However, they often are susceptible to local minima and perform poorly when initialized far from the global optimum, even when the intrinsic dimensionality is known a priori. In this work we introduce a prior over the dimensionality of the latent space, and simultaneously optimize both the latent space and its intrinsic dimensionality. Ad-hoc initialization schemes are unnecessary with our approach; we initialize the latent space to the observation space and automatically infer the latent dimensionality using an optimization scheme that drops dimensions in a continuous fashion. We report results applying our prior to various tasks involving probabilistic non-linear dimensionality reduction, and show that our method can outperform graph-based dimensionality reduction techniques as well as previously suggested ad-hoc initialization strategies.
2008年9月26日 00:00:00 GMThttps://hdl.handle.net/1721.1/428402008年09月26日T00:00:00ZStochastic Combinatorial Optimization with Risk
https://hdl.handle.net/1721.1/42837
Stochastic Combinatorial Optimization with Risk
Nikolova, Evdokia
We consider general combinatorial optimization problems that can be formulated as minimizing the weight of a feasible solution w^T x over an arbitrary feasible set. For these problems we describe a broad class of corresponding stochastic problems where the weight vector w has independent random components, unknown at the time of solution. A natural and important objective that incorporates risk in this stochastic setting is to look for a feasible solution whose stochastic weight has a small tail or a small linear combination of mean and standard deviation. Our models can be equivalently reformulated as deterministic nonconvex programs for which no efficient algorithms are known. In this paper, we make progress on these hard problems. Our results are several efficient general-purpose approximation schemes. They use as a black box the (exact or approximate) solution to the underlying deterministic combinatorial problem, and thus immediately apply to arbitrary combinatorial problems. For example, from an available δ-approximation algorithm for the deterministic problem, we construct a δ(1 + ε)-approximation algorithm that invokes the deterministic algorithm only a logarithmic number of times in the input and polynomial in 1/ε, for any desired accuracy level ε > 0. The algorithms are based on a geometric analysis of the curvature and approximability of the nonlinear level sets of the objective functions.
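For intuition, the Python sketch below shows the black-box pattern on a toy instance (a simplification, not the paper's approximation scheme or its guarantees): to minimize mean plus standard deviation, it scans a geometric grid of trade-off multipliers, calls an exact linear-objective solver for each scalarized weight, and keeps the best candidate under the nonlinear objective.

    import math

    # Toy feasible set: all s-t paths in a tiny DAG, listed as edge tuples.
    paths = [("s-a", "a-t"), ("s-b", "b-t"), ("s-a", "a-b", "b-t")]
    mu = {"s-a": 2.0, "a-t": 6.0, "s-b": 5.0, "b-t": 2.0, "a-b": 1.0}
    var = {"s-a": 9.0, "a-t": 0.5, "s-b": 0.5, "b-t": 0.5, "a-b": 0.5}

    def linear_oracle(weight):
        """Black box: exact minimizer of a linear objective over the
        feasible set (here, brute force over the listed paths)."""
        return min(paths, key=lambda p: sum(weight[e] for e in p))

    def risk(p, c=1.0):
        return sum(mu[e] for e in p) + c * math.sqrt(sum(var[e] for e in p))

    # Scan a geometric grid of lambda, scalarizing mean + lambda * variance.
    best = None
    for lam in [0.0] + [2.0 ** k for k in range(-6, 7)]:
        cand = linear_oracle({e: mu[e] + lam * var[e] for e in mu})
        if best is None or risk(cand) < risk(best):
            best = cand
    print(best, round(risk(best), 3))   # -> ('s-b', 'b-t') 8.0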
2008年9月13日 00:00:00 GMThttps://hdl.handle.net/1721.1/428372008年09月13日T00:00:00ZAutomatic Creation of SQL Injection and Cross-Site Scripting Attacks
https://hdl.handle.net/1721.1/42836
Automatic Creation of SQL Injection and Cross-Site Scripting Attacks
Kiezun, Adam; Guo, Philip J.; Jayaraman, Karthick; Ernst, Michael D.
We present a technique for finding security vulnerabilities in Web applications. SQL Injection (SQLI) and cross-site scripting (XSS) attacks are widespread forms of attack in which the attacker crafts the input to the application to access or modify user data and execute malicious code. In the most serious attacks (called second-order, or persistent, XSS), an attacker can corrupt a database so as to cause subsequent users to execute malicious code. This paper presents an automatic technique for creating inputs that expose SQLI and XSS vulnerabilities. The technique generates sample inputs, symbolically tracks taints through execution (including through database accesses), and mutates the inputs to produce concrete exploits. Ours is the first analysis of which we are aware that precisely addresses second-order XSS attacks. Our technique creates real attack vectors, has few false positives, incurs no runtime overhead for the deployed application, works without requiring modification of application code, and handles dynamic programming-language constructs. We implemented the technique for PHP, in a tool Ardilla. We evaluated Ardilla on five PHP applications and found 68 previously unknown vulnerabilities (23 SQLI, 33 first-order XSS, and 12 second-order XSS).
2008年9月10日 00:00:00 GMThttps://hdl.handle.net/1721.1/428362008年09月10日T00:00:00ZHow do programs become more concurrent? A story of program transformations
https://hdl.handle.net/1721.1/42832
How do programs become more concurrent? A story of program transformations
Dig, Danny; Marrero, John; Ernst, Michael D.
For several decades, programmers have relied on Moore's Law to improve the performance of their software applications. From now on, programmers need to program the multi-cores if they want to deliver efficient code. In the multi-core era, a major maintenance task will be to make sequential programs more concurrent. What are the most common transformations to retrofit concurrency into sequential programs? We studied the source code of 5 open-source Java projects. We analyzed qualitatively and quantitatively the change patterns that developers have used in order to retrofit concurrency. We found that these transformations belong to four categories: transformations that improve the latency, the throughput, the scalability, or the correctness of the applications. In addition, we report on our experience of parallelizing one of our own programs. Our findings can educate software developers on how to parallelize sequential programs, and can provide hints for tool vendors about what transformations are worth automating.
2008年9月05日 00:00:00 GMThttps://hdl.handle.net/1721.1/428322008年09月05日T00:00:00ZStyle Translation for Human Motion (Supplemental Material)
https://hdl.handle.net/1721.1/42004
Style Translation for Human Motion (Supplemental Material)
Hsu, Eugene; Pulli, Kari; Popovic, Jovan
Style translation is the process of transforming an input motion into a new style while preserving its original content. This problem is motivated by the needs of interactive applications, which require rapid processing of captured performances. Our solution learns to translate by analyzing differences between performances of the same content in input and output styles. It relies on a novel correspondence algorithm to align motions, and a linear time-invariant model to represent stylistic differences. Once the model is estimated with system identification, our system is capable of translating streaming input with simple linear operations at each frame.
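One concrete form such a linear time-invariant translation model can take (our notation, for illustration; the paper's exact parameterization may differ) is an ARX-style recurrence, which is why per-frame translation reduces to a few matrix multiplies:

```latex
% y_t: translated (output-style) pose at frame t; u_t: input-style pose.
% A_i, B_j are matrices estimated offline by system identification.
y_{t} \;=\; \sum_{i=1}^{p} A_{i}\, y_{t-i} \;+\; \sum_{j=0}^{q} B_{j}\, u_{t-j}
```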
2005年8月01日 00:00:00 GMThttps://hdl.handle.net/1721.1/420042005年08月01日T00:00:00ZInteractive Simulation of Stylized Human Locomotion
https://hdl.handle.net/1721.1/42003
Interactive Simulation of Stylized Human Locomotion
Silva, Marco da; Popovic, Jovan; Abe, Yeuhi
Animating natural human motion in dynamic environments is difficult because of complex geometric and physical interactions. Simulation provides an automatic solution to parts of this problem, but it needs control systems to produce lifelike motions. This paper describes the systematic computation of controllers that can reproduce a range of locomotion styles in interactive simulations. Given a reference motion that describes the desired style, a derived control system can reproduce that style in simulation and in new environments. Because it produces high-quality motions that are both geometrically and physically consistent with simulated surroundings, interactive animation systems could begin to use this approach with more established kinematic methods.
2008年8月01日 00:00:00 GMThttps://hdl.handle.net/1721.1/420032008年08月01日T00:00:00ZGuided Time Warping for Motion Editing
https://hdl.handle.net/1721.1/41946
Guided Time Warping for Motion Editing
Hsu, Eugene; Silva, Marco da; Popovic, Jovan
Time warping allows users to modify timing without affecting poses. It has many applications in animation systems for motion editing, such as refining motions to meet new timing constraints or modifying the acting of animated characters. However, time warping typically requires many manual adjustments to achieve the desired results. We present a technique that simplifies this process by allowing time warps to be guided by a provided reference motion. Given a few timing constraints, it computes a warp that both satisfies these constraints and maximizes local timing similarities to the reference. The algorithm is fast enough to incorporate into standard animation workflows. We apply the technique to two common tasks: preserving the natural timing of motions under new time constraints and modifying the timing of motions for stylistic effects.
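For flavor, here is a generic dynamic-programming alignment of an input motion to a reference (plain DTW in Python; the paper's guided objective and constraint handling are more specific than this sketch):

```python
# Generic DTW-style alignment: find frame correspondences between an input
# motion and a reference that maximize local timing similarity (illustrative
# only; not the paper's exact objective or constraint set).
import numpy as np

def align(motion, reference):
    n, m = len(motion), len(reference)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(np.asarray(motion[i - 1]) -
                               np.asarray(reference[j - 1]))
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1],
                                 cost[i - 1, j - 1])
    # Backtrack from (n, m) to recover the warp path.
    path, i, j = [], n, m
    while (i, j) != (0, 0):
        path.append((i - 1, j - 1))
        steps = []
        if i > 0 and j > 0:
            steps.append((cost[i - 1, j - 1], i - 1, j - 1))
        if i > 0:
            steps.append((cost[i - 1, j], i - 1, j))
        if j > 0:
            steps.append((cost[i, j - 1], i, j - 1))
        _, i, j = min(steps)
    return path[::-1]  # list of (input_frame, reference_frame) pairs
```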
2007年8月01日 00:00:00 GMThttps://hdl.handle.net/1721.1/419462007年08月01日T00:00:00ZStyle Translation for Human Motion
https://hdl.handle.net/1721.1/41945
Style Translation for Human Motion
Hsu, Eugene; Pulli, Kari; Popovic, Jovan
Style translation is the process of transforming an input motion into a new style while preserving its original content. This problem is motivated by the needs of interactive applications, which require rapid processing of captured performances. Our solution learns to translate by analyzing differences between performances of the same content in input and output styles. It relies on a novel correspondence algorithm to align motions, and a linear time-invariant model to represent stylistic differences. Once the model is estimated with system identification, our system is capable of translating streaming input with simple linear operations at each frame.
2005年8月01日 00:00:00 GMThttps://hdl.handle.net/1721.1/419452005年08月01日T00:00:00ZExample-Based Control of Human Motion
https://hdl.handle.net/1721.1/41944
Example-Based Control of Human Motion
Hsu, Eugene; Gentry, Sommer; Popovic, Jovan
In human motion control applications, the mapping between a control specification and an appropriate target motion often defies an explicit encoding. We present a method that allows such a mapping to be defined by example, given that the control specification is recorded motion. Our method begins by building a database of semantically meaningful instances of the mapping, each of which is represented by synchronized segments of control and target motion. A dynamic programming algorithm can then be used to interpret an input control specification in terms of mapping instances. This interpretation induces a sequence of target segments from the database, which is concatenated to create the appropriate target motion. We evaluate our method on two examples of indirect control. In the first, we synthesize a walking human character that follows a sampled trajectory. In the second, we generate a synthetic partner for a dancer whose motion is acquired through motion capture.
2004年7月01日 00:00:00 GMThttps://hdl.handle.net/1721.1/419442004年07月01日T00:00:00ZA Note on Perturbation Results for Learning Empirical Operators
https://hdl.handle.net/1721.1/41940
A Note on Perturbation Results for Learning Empirical Operators
De Vito, Ernesto; Belkin, Mikhail; Rosasco, Lorenzo
A large number of learning algorithms, for example spectral clustering, kernel Principal Components Analysis, and many manifold methods, are based on estimating eigenvalues and eigenfunctions of operators defined by a similarity function or a kernel, given empirical data. Thus, for the analysis of algorithms, it is an important problem to be able to assess the quality of such approximations. The contribution of our paper is two-fold: 1. We use a technique based on a concentration inequality for Hilbert spaces to provide new, much simplified proofs for a number of results in spectral approximation. 2. Using these methods we provide several new results for estimating spectral properties of the graph Laplacian operator, extending and strengthening results from [26].
2008年8月19日 00:00:00 GMThttps://hdl.handle.net/1721.1/419402008年08月19日T00:00:00ZTransductive Ranking on Graphs
https://hdl.handle.net/1721.1/41938
Transductive Ranking on Graphs
Agarwal, Shivani
In ranking, one is given examples of order relationships among objects, and the goal is to learn from these examples a real-valued ranking function that induces a ranking or ordering over the object space. We consider the problem of learning such a ranking function in a transductive, graph-based setting, where the object space is finite and is represented as a graph in which vertices correspond to objects and edges encode similarities between objects. Building on recent developments in regularization theory for graphs and corresponding Laplacian-based learning methods, we develop an algorithmic framework for learning ranking functions on graphs. We derive generalization bounds for our algorithms in transductive models similar to those used to study other transductive learning problems, and give experimental evidence of the potential benefits of our framework.
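A representative objective in this family, written as a sketch in our notation (V the vertex set, L = D − W the graph Laplacian built from the similarity weights, E_pref the observed preference pairs, and ℓ a margin-based loss):

```latex
% Fit observed preferences (i preferred to j) while keeping the ranking
% function f smooth over the similarity graph.
\min_{f \in \mathbb{R}^{|V|}} \;
\sum_{(i,j) \in E_{\mathrm{pref}}} \ell\!\left(f_i - f_j\right)
\;+\; \lambda\, f^{T} L f,
\qquad L = D - W
```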
2008年8月07日 00:00:00 GMThttps://hdl.handle.net/1721.1/419382008年08月07日T00:00:00ZAdaptive Envelope MDPs for Relational Equivalence-based Planning
https://hdl.handle.net/1721.1/41920
Adaptive Envelope MDPs for Relational Equivalence-based Planning
Gardiol, Natalia H.; Kaelbling, Leslie Pack
We describe a method to use structured representations of the environment's dynamics to constrain and speed up the planning process. Given a problem domain described in a probabilistic logical description language, we develop an anytime technique that incrementally improves on an initial, partial policy. This partial solution is found by first reducing the number of predicates needed to represent a relaxed version of the problem to a minimum, and then dynamically partitioning the action space into a set of equivalence classes with respect to this minimal representation. Our approach uses the envelope MDP framework, which creates a Markov decision process out of a subset of the full state space as determined by the initial partial solution. This strategy permits an agent to begin acting within a restricted part of the full state space and to expand its envelope judiciously as resources permit.
2008年7月29日 00:00:00 GMThttps://hdl.handle.net/1721.1/419202008年07月29日T00:00:00ZUnderstanding camera trade-offs through a Bayesian analysis of light field projections - A revision
https://hdl.handle.net/1721.1/41892
Understanding camera trade-offs through a Bayesian analysis of light field projections - A revision
Levin, Anat; Freeman, William; Durand, Fredo
Computer vision has traditionally focused on extracting structure, such as depth, from images acquired using thin-lens or pinhole optics. The development of computational imaging is broadening this scope; a variety of unconventional cameras do not directly capture a traditional image anymore, but instead require the joint reconstruction of structure and image information. For example, recent coded aperture designs have been optimized to facilitate the joint reconstruction of depth and intensity. The breadth of imaging designs requires new tools to understand the tradeoffs implied by different strategies. This paper introduces a unified framework for analyzing computational imaging approaches. Each sensor element is modeled as an inner product over the 4D light field. The imaging task is then posed as Bayesian inference: given the observed noisy light field projections and a new prior on light field signals, estimate the original light field. Under common imaging conditions, we compare the performance of various camera designs using 2D light field simulations. This framework allows us to better understand the tradeoffs of each camera type and analyze their limitations.
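In symbols, the measurement and inference model the abstract describes is (our notation): each camera is a linear projection T of the light field x, corrupted by noise, and the estimate is the posterior optimum.

```latex
% y: noisy sensor measurements; T: the camera's linear projection of the
% 4D light field x; p(x): the prior on light field signals.
y = T\,x + n, \qquad n \sim \mathcal{N}(0, \sigma^{2} I), \qquad
\hat{x} \;=\; \arg\max_{x}\; p(y \mid x)\, p(x)
```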
2008年7月28日 00:00:00 GMThttps://hdl.handle.net/1721.1/418922008年07月28日T00:00:00ZEvent Order Abstraction for Parametric Real-Time System Verification
https://hdl.handle.net/1721.1/41891
Event Order Abstraction for Parametric Real-Time System Verification
Umeno, Shinya
We present a new abstraction technique, event order abstraction (EOA), for parametric safety verification of real-time systems in which "correct orderings of events" needed for system correctness are maintained by timing constraints on the systems' behavior. By using EOA, one can separate the task of verifying a real-time system into two parts: 1. Safety property verification of the system given that only correct event orderings occur; and 2. Derivation of timing parameter constraints for correct orderings of events in the system. The user first identifies a candidate set of bad event orders. Then, by using ordinary untimed model-checking, the user examines whether a discretized system model in which all timing constraints are abstracted away satisfies a desirable safety property under the assumption that the identified bad event orders occur in no system execution. The user uses counterexamples obtained from the model-checker to identify additional bad event orders, and repeats the process until the model-checking succeeds. In this step, the user obtains a sufficient set of bad event orders that must be excluded by timing synthesis for system correctness. Next, the algorithm presented in the paper automatically derives a set of timing parameter constraints under which the system does not exhibit the identified bad event orderings. From this step combined with the untimed model-checking step, the user obtains a sufficient set of timing parameter constraints under which the system executes correctly with respect to a given safety property. We illustrate the use of EOA with a train-gate example inspired by the general railroad crossing problem. We also summarize three other case studies: a biphase mark protocol, the IEEE 1394 root contention protocol, and the Fischer mutual exclusion algorithm.
2008年7月28日 00:00:00 GMThttps://hdl.handle.net/1721.1/418912008年07月28日T00:00:00ZAn $\Omega(n \log n)$ Lower Bound on the Cost of Mutual Exclusion
https://hdl.handle.net/1721.1/41890
An $\Omega(n \log n)$ Lower Bound on the Cost of Mutual Exclusion
Fan, Rui; Lynch, Nancy
We prove an $\Omega(n \log n)$ lower bound on the number of non-busywaiting memory accesses by any deterministic algorithm solving $n$-process mutual exclusion that communicates via shared registers. The cost of the algorithm is measured in the \emph{state change} cost model, a variation of the cache coherent model. Our bound is tight in this model. We introduce a novel information theoretic proof technique. We first establish a lower bound on the information needed by processes to solve mutual exclusion. Then we relate the amount of information processes can acquire through shared memory accesses to the cost they incur. We believe our proof technique is flexible and intuitive, and may be applied to a variety of other problems and system models.
2006年7月23日 00:00:00 GMThttps://hdl.handle.net/1721.1/418902006年07月23日T00:00:00ZElastic-Net Regularization in Learning Theory
https://hdl.handle.net/1721.1/41889
Elastic-Net Regularization in Learning Theory
De Mol, Christine; Rosasco, Lorenzo; De Vito, Ernesto
Within the framework of statistical learning theory we analyze in detail the so-called elastic-net regularization scheme proposed by Zou and Hastie ["Regularization and variable selection via the elastic net", J. R. Stat. Soc. Ser. B, 67(2):301-320, 2005] for the selection of groups of correlated variables. To investigate the statistical properties of this scheme, and in particular its consistency properties, we set up a suitable mathematical framework. Our setting is random-design regression, where we allow the response variable to be vector-valued and we consider prediction functions which are linear combinations of elements (features) in an infinite-dimensional dictionary. Under the assumption that the regression function admits a sparse representation on the dictionary, we prove that there exists a particular "elastic-net representation" of the regression function such that, as the number of data increases, the elastic-net estimator is consistent not only for prediction but also for variable/feature selection. Our results include finite-sample bounds and an adaptive scheme to select the regularization parameter. Moreover, using convex analysis tools, we derive an iterative thresholding algorithm for computing the elastic-net solution which is different from the optimization procedure originally proposed in "Regularization and variable selection via the elastic net".
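In the familiar finite-dictionary form (one common parameterization, written here for illustration; the paper works with possibly infinite dictionaries and vector-valued responses), the elastic-net estimator solves:

```latex
% Squared loss plus a convex combination of l1 (sparsity) and squared l2
% (grouping/stability) penalties over dictionary coefficients \beta.
\hat{\beta} \;=\; \arg\min_{\beta}\;
\frac{1}{n}\sum_{i=1}^{n}\big(y_i - \beta^{T}\phi(x_i)\big)^{2}
\;+\; \lambda\big(\alpha\,\|\beta\|_{1} + (1-\alpha)\,\|\beta\|_{2}^{2}\big)
```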
2008年7月24日 00:00:00 GMThttps://hdl.handle.net/1721.1/418892008年07月24日T00:00:00ZA Projected Subgradient Method for Scalable Multi-Task Learning
https://hdl.handle.net/1721.1/41888
A Projected Subgradient Method for Scalable Multi-Task Learning
Quattoni, Ariadna; Carreras, Xavier; Collins, Michael; Darrell, Trevor
Recent approaches to multi-task learning have investigated the use of a variety of matrix norm regularization schemes for promoting feature sharing across tasks. In essence, these approaches aim at extending the l1 framework for sparse single-task approximation to the multi-task setting. In this paper we focus on the computational complexity of training a jointly regularized model and propose an optimization algorithm whose complexity is linear in the number of training examples and O(n log n) in the number of parameters n of the joint model. Our algorithm is based on casting jointly regularized loss minimization as a convex constrained optimization problem, for which we develop an efficient projected gradient algorithm. The main contribution of this paper is the derivation of a gradient projection method with l1,∞ constraints that can be performed efficiently and that has provable convergence rates.
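The overall loop is easy to sketch. The code below substitutes the standard sort-based Euclidean projection onto an l1 ball for the paper's l1,∞ projection, so it illustrates the projected (sub)gradient structure rather than the paper's exact projection step:

```python
# Projected subgradient sketch with an l1-ball projection (the paper's
# constraint set is the l1,inf ball; the l1 case shown here is simpler).
import numpy as np

def project_l1(v, radius=1.0):
    """Euclidean projection of v onto the l1 ball, O(n log n) via sorting."""
    if np.abs(v).sum() <= radius:
        return v
    u = np.sort(np.abs(v))[::-1]
    css = np.cumsum(u)
    rho = np.nonzero(u - (css - radius) / (np.arange(len(u)) + 1) > 0)[0][-1]
    theta = (css[rho] - radius) / (rho + 1.0)
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)

def projected_subgradient(grad, w0, radius, steps=100, lr=0.1):
    """Minimize a convex loss over the l1 ball given a (sub)gradient oracle."""
    w = w0
    for t in range(1, steps + 1):
        w = project_l1(w - (lr / np.sqrt(t)) * grad(w), radius)
    return w
```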
2008年7月23日 00:00:00 GMThttps://hdl.handle.net/1721.1/418882008年07月23日T00:00:00ZComposable Probabilistic Inference with Blaise
https://hdl.handle.net/1721.1/41887
Composable Probabilistic Inference with Blaise
Bonawitz, Keith A
Probabilistic inference provides a unified, systematic framework for specifying and solving problems of reasoning under uncertainty. Recent work has demonstrated the great value of probabilistic models defined over complex, structured domains. However, our ability to imagine probabilistic models has far outstripped our ability to programmatically manipulate them and to effectively implement inference, limiting the complexity of the problems that we can solve in practice. This thesis presents Blaise, a novel framework for composable probabilistic modeling and inference, designed to address these limitations. Blaise has three components: * The Blaise State-Density-Kernel (SDK) graphical modeling language, which generalizes factor graphs by: (1) explicitly representing inference algorithms (and their locality) using a new type of graph node, (2) representing hierarchical composition and repeated substructures in the state space, the interest distribution, and the inference procedure, and (3) permitting the structure of the model to change during algorithm execution. * A suite of SDK graph transformations that may be used to extend a model (e.g., to construct a mixture model from a model of a mixture component), or to make inference more effective (e.g., by automatically constructing a parallel tempered version of an algorithm or by exploiting conjugacy in a model). * The Blaise Virtual Machine, a runtime environment that can efficiently execute the stochastic automata represented by Blaise SDK graphs. Blaise encourages the construction of sophisticated models by composing simpler models, allowing the designer to implement and verify small portions of the model and inference method, and to reuse model components from one task to another. Blaise decouples the implementation of the inference algorithm from the specification of the interest distribution, even in cases (such as Gibbs sampling) where the shape of the interest distribution guides the inference. This gives modelers the freedom to explore alternate models without slow, error-prone reimplementation. The compositional nature of Blaise enables novel reinterpretations of advanced Monte Carlo inference techniques (such as parallel tempering) as simple transformations of Blaise SDK graphs. In this thesis, I describe each of the components of the Blaise modeling framework and validate the framework by highlighting a variety of contemporary sophisticated models that have been developed by the Blaise user community. I also present several surprising findings stemming from the Blaise modeling framework, including that an Infinite Relational Model can be built using exactly the same inference methods as a simple mixture model, that constructing a parallel tempered inference algorithm should be a point-and-click/one-line-of-code operation, and that Markov chain Monte Carlo for probabilistic models with complicated long-distance dependencies, such as a stochastic version of Scheme, can be managed using standard Blaise mechanisms.
2008年7月23日 00:00:00 GMThttps://hdl.handle.net/1721.1/418872008年07月23日T00:00:00ZA Distributed Building Evacuation System
https://hdl.handle.net/1721.1/41879
A Distributed Building Evacuation System
Qumsiyeh, Dany M.
This thesis investigates the feasibility of a smart building evacuation system, capable of guiding occupants along safe paths to exits and responding to changing threats. Inspired by developments in amorphous computing, the design presented is scalable to large networks, robust to hardware and communication failure, and based on simple low-cost components. A simulation and hardware prototype demonstrate that this distributed building evacuation system is both feasible and cost effective.
2008年7月14日 00:00:00 GMThttps://hdl.handle.net/1721.1/418792008年07月14日T00:00:00ZKnowledge Benchmarks in Adversarial Mechanism Design (Part I) and Implementation in Surviving Strategies (Part I)
https://hdl.handle.net/1721.1/41878
Knowledge Benchmarks in Adversarial Mechanism Design (Part I) and Implementation in Surviving Strategies (Part I)
Chen, Jing; Micali, Silvio
We put forward new benchmarks and solution concepts for Adversarial Mechanism Design, as defined by [MV07.a], and we exemplify them in the case of truly combinatorial auctions. We benchmark the combined performance (the sum of the auction's efficiency and revenue) of a truly combinatorial auction against a very relevant but private knowledge of the players: essentially, the maximum revenue that the best informed player could guarantee if he were the seller (i.e., by offering each other player a subset of the goods for a take-it-or-leave-it price). We achieve this natural benchmark within a factor of 2, by means of a new and probabilistic auction mechanism, in KNOWINGLY SURVIVING STRATEGIES. That is, the above performance of our mechanism is guaranteed in any rational play, independent of any possible beliefs of the players. Indeed, our performance guarantee holds for any possible choice of strategies, so long as each player chooses a strategy among those surviving iterated elimination of knowingly dominated strategies. Our mechanism is extremely robust. Namely, its performance guarantees hold even if all but one of the players collude (together or in separate groups) in any possible but reasonable way. Essentially, the only restriction on the collective utility function of a collusive subset S of the players is the following: the collective utility increases when one member of S is allocated a subset of the goods "individually better" for him and/or his "individual price" is smaller, while the allocations and prices of all other members of S stay the same. Our results improve on the yet unpublished ones of [MV07.b]. The second part of this paper, dealing with a more aggressive benchmark (essentially, the maximum welfare privately known to the players), is forthcoming.
2008年7月01日 00:00:00 GMThttps://hdl.handle.net/1721.1/418782008年07月01日T00:00:00ZKnowledge Benchmarks in Adversarial Mechanism Design and Implementation in Surviving Strategies (Part I)
https://hdl.handle.net/1721.1/41877
Knowledge Benchmarks in Adversarial Mechanism Design and Implementation in Surviving Strategies (Part I)
Chen, Jing; Micali, Silvio
We put forward new benchmarks and solution concepts for Adversarial Mechanism Design, as defined by [MV07.a], and we exemplify them in the case of truly combinatorial auctions. We benchmark the combined performance (the sum of the auction's efficiency and revenue) of a truly combinatorial auction against a very relevant but private knowledge of the players: essentially, the maximum revenue that the best informed player could guarantee if he were the seller (i.e., by offering each other player a subset of the goods for a take-it-or-leave-it price). We achieve this natural benchmark within a factor of 2, by means of a new and probabilistic auction mechanism, in surviving strategies. That is, the above performance of our mechanism is guaranteed in any rational play, independent of any possible beliefs of the players. Indeed, our performance guarantee holds for any possible choice of strategies, so long as each player chooses a strategy among those surviving iterated elimination of dominated strategies. Our mechanism is extremely robust. Namely, its performance guarantees hold even if all but one of the players collude (together or in separate groups) in any possible but reasonable way. Essentially, the only restriction on the collective utility function of a collusive subset S of the players is the following: the collective utility increases when one member of S is allocated a subset of the goods "individually better" for him and/or his "individual price" is smaller, while the allocations and prices of all other members of S stay the same. Our results improve on the yet unpublished ones of [MV07.b]. The second part of this paper, dealing with a more aggressive benchmark (essentially, the maximum welfare privately known to the players), is forthcoming.
2008年6月01日 00:00:00 GMThttps://hdl.handle.net/1721.1/418772008年06月01日T00:00:00ZLeveraging Player Knowledge in Combinatorial Auctions (and Implementation in Surviving Strategies)
https://hdl.handle.net/1721.1/41875
Leveraging Player Knowledge in Combinatorial Auctions (and Implementation in Surviving Strategies)
Chen, Jing; Micali, Silvio
2008年6月17日 00:00:00 GMThttps://hdl.handle.net/1721.1/418752008年06月17日T00:00:00ZFlexible MIPS Soft Processor Architecture
https://hdl.handle.net/1721.1/41874
Flexible MIPS Soft Processor Architecture
Carli, Roberto
The flexible MIPS soft processor architecture borrows selected technologies from high-performance computing to deliver a modular, highly customizable CPU targeted towards FPGA implementations for embedded systems; the objective is to provide a more flexible architectural alternative to coprocessor-based solutions. The processor performs out-of-order execution on parallel functional units, it delivers in-order instruction commit and it is compatible with the MIPS-1 Instruction Set Architecture. Amongst many available options, the user can introduce custom instructions and matching functional units; modify existing units; change the pipelining depth within functional units to any fixed or variable value; customize instruction definitions in terms of operands, control signals and register file interaction; insert multiple redundant functional units for improved performance. The flexibility provided by the architecture allows the user to expand the processor functionality to implement instructions of coprocessor-level complexity through additional functional units. The processor design was implemented and simulated on two FPGA platforms, tested on multiple applications, and compared to three commercially available soft processor solutions in terms of features, area, clock frequency and benchmark performance.
2008年6月16日 00:00:00 GMThttps://hdl.handle.net/1721.1/418742008年06月16日T00:00:00ZDetecting and Tolerating Byzantine Faults in Database Systems
https://hdl.handle.net/1721.1/41873
Detecting and Tolerating Byzantine Faults in Database Systems
Vandiver, Benjamin Mead
This thesis describes the design, implementation, and evaluation of a replication scheme to handle Byzantine faults in transaction processing database systems. The scheme compares answers from queries and updates on multiple replicas, which are off-the-shelf database systems, to provide a single database that is Byzantine fault tolerant. The scheme works when the replicas are homogeneous, but it also allows heterogeneous replication, in which replicas come from different vendors. Heterogeneous replicas reduce the impact of bugs and security compromises because they are implemented independently and are thus less likely to suffer correlated failures. A final component of the scheme is a repair mechanism that can correct the state of a faulty replica, ensuring the longevity of the scheme. The main challenge in designing a replication scheme for transaction processing systems is ensuring that the replicas' state does not diverge while allowing a high degree of concurrency. We have developed two novel concurrency control protocols, commit barrier scheduling (CBS) and snapshot epoch scheduling (SES), that provide strong consistency and good performance. The two protocols provide different types of consistency: CBS provides single-copy serializability and SES provides single-copy snapshot isolation. We have implemented both protocols in the context of a replicated SQL database. Our implementation has been tested with production versions of several commercial and open source databases as replicas. Our experiments show that a configuration that can tolerate one faulty replica has only a modest performance overhead (about 10-20% for the TPC-C benchmark). Our implementation successfully masks several Byzantine faults observed in practice, and we have used it to find a new bug in MySQL.
2008年6月30日 00:00:00 GMThttps://hdl.handle.net/1721.1/418732008年06月30日T00:00:00ZRevenue in Truly Combinatorial Auctions and Adversarial Mechanism Design
https://hdl.handle.net/1721.1/41872
Revenue in Truly Combinatorial Auctions and Adversarial Mechanism Design
Micali, Silvio; Valiant, Paul
Little is known about generating revenue in UNRESTRICTED combinatorial auctions. (In particular, the VCG mechanism has no revenue guarantees.) In this paper we determine how much revenue can be guaranteed in such auctions. Our analysis holds both in the standard model, when all players are independent and rational, as well as in a most adversarial model, where some players may bid collusively or even totally irrationally.
2007年11月02日 00:00:00 GMThttps://hdl.handle.net/1721.1/418722007年11月02日T00:00:00ZSafe Open-Nested Transactions Through Ownership
https://hdl.handle.net/1721.1/41871
Safe Open-Nested Transactions Through Ownership
Agrawal, Kunal; Lee, I-Ting Angelina; Sukha, Jim
Researchers in transactional memory (TM) have proposed open nesting as a methodology for increasing the concurrency of a program. The idea is to ignore certain "low-level" memory operations of an open-nested transaction when detecting conflicts for its parent transaction, and instead perform abstract concurrency control for the "high-level" operation that the nested transaction represents. To support this methodology, TM systems use an open-nested commit mechanism that commits all changes performed by an open-nested transaction directly to memory, thereby avoiding low-level conflicts. Unfortunately, because the TM runtime is unaware of the different levels of memory, an unconstrained use of open-nested commits can lead to anomalous program behavior. In this paper, we describe a framework of ownership-aware transactional memory which incorporates the notion of modules into the TM system and requires that transactions and data be associated with specific transactional modules, or Xmodules. We propose a new ownership-aware commit mechanism, a hybrid between an open-nested and closed-nested commit, which commits a piece of data differently depending on whether the current Xmodule owns the data or not. Moreover, we give a set of precise constraints on interactions and sharing of data among the Xmodules based on familiar notions of abstraction. We prove that ownership-aware TM has clean memory-level semantics and can guarantee serializability by modules, which is an adaptation of multilevel serializability from databases to TM. In addition, we describe how a programmer can specify Xmodules and ownership in a Java-like language. Our type system can enforce most of the constraints required by ownership-aware TM statically, and can enforce the remaining constraints dynamically. Finally, we prove that if transactions in the process of aborting obey restrictions on their memory footprint, the OAT model is free from semantic deadlock.
2008年2月20日 00:00:00 GMThttps://hdl.handle.net/1721.1/418712008年02月20日T00:00:00ZMatching Sets of Features for Efficient Retrieval and Recognition
https://hdl.handle.net/1721.1/41864
Matching Sets of Features for Efficient Retrieval and Recognition
Grauman, Kristen Lorraine
In numerous domains it is useful to represent a single example by the collection of local features or parts that comprise it. In computer vision in particular, local image features are a powerful way to describe images of objects and scenes. Their stability under variable image conditions is critical for success in a wide range of recognition and retrieval applications. However, many conventional similarity measures and machine learning algorithms assume vector inputs. Comparing and learning from images represented by sets of local features is therefore challenging, since each set may vary in cardinality and its elements lack a meaningful ordering. In this thesis I present computationally efficient techniques to handle comparisons, learning, and indexing with examples represented by sets of features. The primary goal of this research is to design and demonstrate algorithms that can effectively accommodate this useful representation in a way that scales with both the representation size and the number of images available for indexing or learning. I introduce the pyramid match algorithm, which efficiently forms an implicit partial matching between two sets of feature vectors. The matching has a linear time complexity, naturally forms a Mercer kernel, and is robust to clutter or outlier features, a critical advantage for handling images with variable backgrounds, occlusions, and viewpoint changes. I provide bounds on the expected error relative to the optimal partial matching. For very large databases, even extremely efficient pairwise comparisons may not offer adequately responsive query times. I show how to perform sub-linear time retrievals under the matching measure with randomized hashing techniques, even when input sets have varying numbers of features. My results are focused on several important vision tasks, including applications to content-based image retrieval, discriminative classification for object recognition, kernel regression, and unsupervised learning of categories. I show how the dramatic increase in performance enables accurate and flexible image comparisons to be made on large-scale data sets, and removes the need to artificially limit the number of local descriptions used per image when learning visual categories.
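A minimal 1-D rendering of the pyramid match idea (an illustrative sketch, not the thesis implementation; real usage is over multi-dimensional feature sets):

```python
# Pyramid match sketch: histogram intersections at a geometric sequence of
# bin widths; matches found at finer levels count more (weight 1/2^i).
import numpy as np

def pyramid_match(x, y, levels=5, span=1.0):
    """Match two sets of scalar features in [0, span) via a histogram pyramid."""
    score, prev = 0.0, 0.0
    for i in range(levels):
        bins = 2 ** (levels - i)                 # start fine, coarsen by 2x
        hx, _ = np.histogram(x, bins=bins, range=(0.0, span))
        hy, _ = np.histogram(y, bins=bins, range=(0.0, span))
        inter = np.minimum(hx, hy).sum()         # implicit matches at this level
        score += (inter - prev) / (2 ** i)       # count only the new matches
        prev = inter
    return score
```

Because the score is a weighted sum of histogram intersections, it stays linear in the number of features, which is the source of the linear-time claim in the abstract.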
2006年8月11日 00:00:00 GMThttps://hdl.handle.net/1721.1/418642006年08月11日T00:00:00ZFast concurrent object classification and localization
https://hdl.handle.net/1721.1/41862
Fast concurrent object classification and localization
Yeh, Tom; Lee, John J.; Darrell, Trevor
Object localization and classification are important problems in computer vision. However, in many applications, exhaustive search over all class labels and image locations is computationally prohibitive. While several methods have been proposed to make either classification or localization more efficient, few have dealt with both tasks simultaneously. This paper proposes an efficient method for concurrent object localization and classification based on a data-dependent multi-class branch-and-bound formalism. Existing bag-of-features classification schemes, which can be expressed as weighted combinations of feature counts, can be readily adapted to our method. We present experimental results that demonstrate the merit of our algorithm in terms of classification accuracy, localization accuracy, and speed, compared to baseline approaches including exhaustive search, the ISM method, and single-class branch and bound.
2008年6月10日 00:00:00 GMThttps://hdl.handle.net/1721.1/418622008年06月10日T00:00:00ZAgent Organization in the Knowledge Plane
https://hdl.handle.net/1721.1/41861
Agent Organization in the Knowledge Plane
Li, Ji
In designing and building a network like the Internet, we continue to face the problems of scale and distribution. With the dramatic expansion in scale and heterogeneity of the Internet, network management has become an increasingly difficult task. Furthermore, network applications often need to maintain efficient organization among the participants by collecting information from the underlying networks. Such individual information collection activities lead to duplicate efforts and contention for network resources. The Knowledge Plane (KP) is a new common construct that provides knowledge and expertise to meet the functional, policy, and scaling requirements of network management, as well as to create synergy and exploit commonality among many network applications. To achieve these goals, we face many challenging problems, including widely distributed data collection, efficient processing of that data, wide availability of the expertise, etc. In this thesis, to provide better support for network management and large-scale network applications, I propose a knowledge plane architecture that consists of a network knowledge plane (NetKP) at the network layer and, on top of it, multiple specialized KPs (spec-KPs). The NetKP organizes agents to provide valuable knowledge and facilities about the Internet to the spec-KPs. Each spec-KP is specialized in its own area of interest. In both the NetKP and the spec-KPs, agents are organized into regions based on different sets of constraints. I focus on two key design issues in the NetKP: (1) a region-based architecture for agent organization, in which I design an efficient and non-intrusive organization among regions that combines network topology and a distributed hash table; (2) request and knowledge dissemination, in which I design a robust and efficient broadcast and aggregation mechanism using a tree structure among regions. In the spec-KPs, I build two examples: experiment management on the PlanetLab testbed and distributed intrusion detection on the DETER testbed. The experimental results suggest that a common approach, driven by the design principles of the Internet together with more specialized constraints, can yield productive organization for network management and applications.
2008年6月11日 00:00:00 GMThttps://hdl.handle.net/1721.1/418612008年06月11日T00:00:00ZNon-Metrical Navigation Through Visual Path Control
https://hdl.handle.net/1721.1/41860
Non-Metrical Navigation Through Visual Path Control
Huang, Albert S.; Teller, Seth
We describe a new method for wide-area, non-metrical robot navigation which enables useful, purposeful motion indoors. Our method has two phases: a training phase, in which a human user directs a wheeled robot with an attached camera through an environment while occasionally supplying textual place names; and a navigation phase, in which the user specifies goal place names (again as text), and the robot issues low-level motion control in order to move to the specified place. We show that differences in the visual-field locations and scales of features matched across training and navigation can be used to construct a simple and robust control rule that guides the robot onto and along the training motion path. Our method uses an omnidirectional camera, requires approximate intrinsic and extrinsic camera calibration, and is capable of effective motion control within an extended, minimally-prepared building environment floorplan. We give results for deployment within a single building floor with 7 rooms, 6 corridor segments, and 15 distinct place names.
2008年6月06日 00:00:00 GMThttps://hdl.handle.net/1721.1/418602008年06月06日T00:00:00ZOn a model of visual cortex: learning invariance and selectivity
https://hdl.handle.net/1721.1/41858
On a model of visual cortex: learning invariance and selectivity
Caponnetto, Andrea; Poggio, Tomaso; Smale, Steve
In this paper we present a class of algorithms for similarity learning on spaces of images. The general framework that we introduce is motivated by some well-known hierarchical pre-processing architectures for object recognition which have been developed during the last decade, and which have been in some cases inspired by functional models of the ventral stream of the visual cortex. These architectures are characterized by the construction of a hierarchy of "local" feature representations of the visual stimulus. We show that our framework includes some well-known techniques, and that it is suitable for the analysis of dynamic visual stimuli, presenting a quantitative error analysis in this setting.
2008年4月04日 00:00:00 GMThttps://hdl.handle.net/1721.1/418582008年04月04日T00:00:00ZThe SoftPHY Abstraction: from Packets to Symbols in Wireless Network Design
https://hdl.handle.net/1721.1/41857
The SoftPHY Abstraction: from Packets to Symbols in Wireless Network Design
Jamieson, Kyle
At ever-increasing rates, we are using wireless systems to communicate with others and retrieve content of interest to us. Current wireless technologies such as WiFi or Zigbee use forward error correction to drive bit error rates down when there are few interfering transmissions. However, as more of us use wireless networks to retrieve increasingly rich content, interference increases in unpredictable ways. This results in errored bits, degraded throughput, and eventually, an unusable network. We observe that this is the result of higher layers working at the packet granularity, whereas they would benefit from a shift in perspective from whole packets to individual symbols. From real-world experiments on a 31-node testbed of Zigbee and software-defined radios, we find that often, not all of the bits in corrupted packets share fate. Thus, today's wireless protocols retransmit packets where only a small number of the constituent bits in a packet are in error, wasting network resources. In this dissertation, we will describe a physical layer that passes information about its confidence in each decoded symbol up to higher layers. These SoftPHY hints have many applications, one of which, more efficient link-layer retransmissions, we will describe in detail. PP-ARQ is a link-layer reliable retransmission protocol that allows a receiver to compactly encode a request for retransmission of only the bits in a packet that are likely in error. Our experimental results show that PP-ARQ increases aggregate network throughput by a factor of approximately 2x under various conditions. Finally, we will place our contributions in the context of related work and discuss other uses of SoftPHY throughout the wireless networking stack.
2008年6月03日 00:00:00 GMThttps://hdl.handle.net/1721.1/418572008年06月03日T00:00:00ZIgnorable Information in Multi-Agent Scenarios
https://hdl.handle.net/1721.1/41530
Ignorable Information in Multi-Agent Scenarios
Milch, Brian; Koller, Daphne
In some multi-agent scenarios, identifying observations that an agent can safely ignore reduces exponentially the size of the agent's strategy space and hence the time required to find a Nash equilibrium. We consider games represented using the multi-agent influence diagram (MAID) framework of Koller and Milch [2001], and analyze the extent to which information edges can be eliminated. We define a notion of a safe edge removal transformation, where all equilibria in the reduced model are also equilibria in the original model. We show that existing edge removal algorithms for influence diagrams are safe, but limited, in that they do not detect certain cases where edges can be removed safely. We describe an algorithm that produces the "minimal" safe reduction, which removes as many edges as possible while still preserving safety. Finally, we note that both the existing edge removal algorithms and our new one can eliminate equilibria where agents coordinate their actions by conditioning on irrelevant information. Surprisingly, in some games these "lost" equilibria can be preferred by all agents in the game.
2008年5月12日 00:00:00 GMThttps://hdl.handle.net/1721.1/415302008年05月12日T00:00:00ZPerfect Implementation of Normal-Form Mechanisms
https://hdl.handle.net/1721.1/41527
Perfect Implementation of Normal-Form Mechanisms
Izmalkov, Sergei; Lepinski, Matt; Micali, Silvio
Privacy and trust affect our strategic thinking, yet they have not been precisely modeled in mechanism design. In settings of incomplete information, traditional implementations of a normal-form mechanism ---by disregarding the players' privacy, or assuming trust in a mediator--- may not be realistic and fail to reach the mechanism's objectives. We thus investigate implementations of a new type. We put forward the notion of a perfect implementation of a normal-form mechanism M: in essence, an extensive-form mechanism exactly preserving all strategic properties of M, WITHOUT relying on a trusted mediator or violating the privacy of the players. We prove that ANY normal-form mechanism can be perfectly implemented by a PUBLIC mediator using envelopes and an envelope-randomizing device (i.e., the same tools used for running fair lotteries or tallying secret votes). Differently from a trusted mediator, a public one only performs prescribed public actions, so that everyone can verify that he is acting properly, and never learns any information that should remain private.
2007年3月01日 00:00:00 GMThttps://hdl.handle.net/1721.1/415272007年03月01日T00:00:00ZGesture in Automatic Discourse Processing
https://hdl.handle.net/1721.1/41526
Gesture in Automatic Discourse Processing
Eisenstein, Jacob
Computers cannot fully understand spoken language without access to the wide range of modalities that accompany speech. This thesis addresses the particularly expressive modality of hand gesture, and focuses on building structured statistical models at the intersection of speech, vision, and meaning. My approach is distinguished in two key respects. First, gestural patterns are leveraged to discover parallel structures in the meaning of the associated speech. This differs from prior work that attempted to interpret individual gestures directly, an approach that was prone to a lack of generality across speakers. Second, I present novel, structured statistical models for multimodal language processing, which enable learning about gesture in its linguistic context, rather than in the abstract. These ideas find successful application in a variety of language processing tasks: resolving ambiguous noun phrases, segmenting speech into topics, and producing keyframe summaries of spoken language. In all three cases, the addition of gestural features -- extracted automatically from video -- yields significantly improved performance over a state-of-the-art text-only alternative. This marks the first demonstration that hand gesture improves automatic discourse processing.
2008年5月07日 00:00:00 GMThttps://hdl.handle.net/1721.1/415262008年05月07日T00:00:00ZEfficient Object Recognition and Image Retrieval for Large-Scale Applications
https://hdl.handle.net/1721.1/41519
Efficient Object Recognition and Image Retrieval for Large-Scale Applications
Lee, John J.
Algorithms for recognition and retrieval tasks generally call for both speed and accuracy. When scaling up to very large applications, however, we encounter additional significant requirements: adaptability and scalability. In many real-world systems, large numbers of images are constantly added to the database, requiring the algorithm to quickly tune itself to recent trends so it can serve queries more effectively. Moreover, the systems need to be able to meet the demands of simultaneous queries from many users. In this thesis, I describe two new algorithms intended to meet these requirements and give an extensive experimental evaluation for both. The first algorithm constructs an adaptive vocabulary forest, which is an efficient image-database model that grows and shrinks as needed while adapting its structure to tune itself to recent trends. The second algorithm is a method for efficiently performing classification tasks by comparing query images to only a fixed number of training examples, regardless of the size of the image database. These two methods can be combined to create a fast, adaptable, and scalable vision system suitable for large-scale applications. I also introduce LIBPMK, a fast implementation of common computer vision processing pipelines such as that of the pyramid match kernel. This implementation was used to build several successful interactive applications as well as batch experiments for research settings. This implementation, in addition to the two new algorithms introduced by this thesis, is a step toward meeting the speed, adaptability, and scalability requirements of practical large-scale vision systems.
2008年5月06日 00:00:00 GMThttps://hdl.handle.net/1721.1/415192008年05月06日T00:00:00ZNew-Age Cryptography
https://hdl.handle.net/1721.1/41518
New-Age Cryptography
Pass, Rafael; Vaikuntanathan, Vinod
We introduce new and general complexity theoretic hardness assumptions. These assumptions abstract out concrete properties of a random oracle and are significantly stronger than traditional cryptographic hardness assumptions; however, assuming their validity we can resolve a number of longstanding open problems in cryptography.
2008年4月16日 00:00:00 GMThttps://hdl.handle.net/1721.1/415182008年04月16日T00:00:00ZTransferring Nonlinear Representations using Gaussian Processes with a Shared Latent Space
https://hdl.handle.net/1721.1/41517
Transferring Nonlinear Representations using Gaussian Processes with a Shared Latent Space
Urtasun, Raquel; Quattoni, Ariadna; Lawrence, Neil; Darrell, Trevor
When a series of problems are related, representations derived from learning earlier tasks may be useful in solving later problems. In this paper we propose a novel approach to transfer learning with low-dimensional, non-linear latent spaces. We show how such representations can be jointly learned across multiple tasks in a Gaussian Process framework. When transferred to new tasks with relatively few training examples, learning can be faster and/or more accurate. Experiments on digit recognition and newsgroup classification tasks show significantly improved performance when compared to baseline performance with a representation derived from a semi-supervised learning approach or with a discriminative approach that uses only the target data.
2008年4月11日 00:00:00 GMThttps://hdl.handle.net/1721.1/415172008年04月11日T00:00:00ZRandom-World Semantics and Syntactic Independence for Expressive Languages
https://hdl.handle.net/1721.1/41516
Random-World Semantics and Syntactic Independence for Expressive Languages
McAllester, David; Milch, Brian; Goodman, Noah D.
We consider three desiderata for a language combining logic and probability: logical expressivity, random-world semantics, and the existence of a useful syntactic condition for probabilistic independence. Achieving these three desiderata simultaneously is nontrivial. Expressivity can be achieved by using a formalism similar to a programming language, but standard approaches to combining programming languages with probabilities sacrifice random-world semantics. Naive approaches to restoring random-world semantics undermine syntactic independence criteria. Our main result is a syntactic independence criterion that holds for a broad class of highly expressive logics under random-world semantics. We explore various examples including Bayesian networks, probabilistic context-free grammars, and an example from Mendelian genetics. Our independence criterion supports a case-factor inference technique that reproduces both variable elimination for BNs and the inside algorithm for PCFGs.
2008年5月03日 00:00:00 GMThttps://hdl.handle.net/1721.1/415162008年05月03日T00:00:00ZGeneralization of the MV Mechanism
https://hdl.handle.net/1721.1/41515
Generalization of the MV Mechanism
Chen, Jing
Micali and Valiant proposed a mechanism for combinatorial auctions that is dominant-strategy truthful, guarantees reasonably high revenue, and is very resilient against collusions. Their mechanism, however, uses as a subroutine the VCG mechanism, which is not polynomial time. We propose a modification of their mechanism that is efficient, while retaining their collusion resilience and a good fraction of their revenue, if given as a subroutine an efficient approximation of the VCG mechanism.
2008年5月01日 00:00:00 GMThttps://hdl.handle.net/1721.1/415152008年05月01日T00:00:00ZBlock Heavy Hitters
https://hdl.handle.net/1721.1/41514
Block Heavy Hitters
Andoni, Alexandr; Ba, Khanh Do; Indyk, Piotr
We study a natural generalization of the heavy hitters problem in the streaming context. We term this generalization *block heavy hitters* and define it as follows. We are to stream over a matrix $A,ドル and report all *rows* that are heavy, where a row is heavy if its $\ell_1$-norm is at least a $\phi$ fraction of the $\ell_1$-norm of the entire matrix $A$. In comparison, in the standard heavy hitters problem, we are required to report the matrix *entries* that are heavy. As is common in streaming, we solve the problem approximately: we return all rows with weight at least $\phi,ドル but also possibly some other rows that have weight no less than $(1-\epsilon)\phi$. To solve the block heavy hitters problem, we show how to construct a linear sketch of $A$ from which we can recover the heavy rows of $A$. The block heavy hitters problem has already found applications for other streaming problems. In particular, it is a crucial building block in a streaming algorithm that constructs a small-size sketch for the Ulam metric, a metric on non-repetitive strings under the edit (Levenshtein) distance.
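The problem semantics are easy to state in code. Below is an exact, non-streaming reference implementation of the definition; the paper's contribution is recovering the same rows from a small linear sketch of $A$ in one pass, which this sketch does not attempt:

```python
# Exact reference for block heavy hitters: report every row whose l1 norm
# is at least a phi fraction of the whole matrix's l1 norm.
import numpy as np

def block_heavy_hitters(A, phi):
    row_mass = np.abs(A).sum(axis=1)          # l1 norm of each row
    return np.nonzero(row_mass >= phi * row_mass.sum())[0]
```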
2008年5月02日 00:00:00 GMThttps://hdl.handle.net/1721.1/415142008年05月02日T00:00:00ZUnderstanding camera trade-offs through a Bayesian analysis of light field projections
https://hdl.handle.net/1721.1/41513
Understanding camera trade-offs through a Bayesian analysis of light field projections
Levin, Anat; Freeman, William T.; Durand, Fredo
Computer vision has traditionally focused on extracting structure, such as depth, from images acquired using thin-lens or pinhole optics. The development of computational imaging is broadening this scope; a variety of unconventional cameras do not directly capture a traditional image anymore, but instead require the joint reconstruction of structure and image information. For example, recent coded aperture designs have been optimized to facilitate the joint reconstruction of depth and intensity. The breadth of imaging designs requires new tools to understand the tradeoffs implied by different strategies. This paper introduces a unified framework for analyzing computational imaging approaches. Each sensor element is modeled as an inner product over the 4D light field. The imaging task is then posed as Bayesian inference: given the observed noisy light field projections and a new prior on light field signals, estimate the original light field. Under common imaging conditions, we compare the performance of various camera designs using 2D light field simulations. This framework allows us to better understand the tradeoffs of each camera type and analyze their limitations.
2008年4月16日 00:00:00 GMThttps://hdl.handle.net/1721.1/415132008年04月16日T00:00:00ZA Multi-Scale Generalization of the HoG and HMAX Image Descriptors for Object Detection
https://hdl.handle.net/1721.1/41093
A Multi-Scale Generalization of the HoG and HMAX Image Descriptors for Object Detection
Bileschi, Stanley M
Recently, several powerful image features have been proposed which can be described as spatial histograms of oriented energy. For instance, the HoG, HMAX C1, SIFT, and Shape Context features all represent an input image using a discrete set of bins which accumulate evidence for oriented structures over a spatial region and a range of orientations. In this work, we generalize these techniques to allow for a foveated input image, rather than a rectilinear raster. It will be shown that improved object detection accuracy can be achieved by inputting a spectrum of image measurements, from sharp, fine-scale image sampling within a small spatial region within the target to coarse-scale sampling of a wide field of view around the target. Several alternative feature generation algorithms are proposed and tested which suitably make use of foveated image inputs. In the experiments we show that features generated from the foveated input format produce detectors of greater accuracy, as measured for four object types from commonly available data-sets. Finally, a flexible algorithm for generating features is described and tested which is independent of input topology and uses ICA to learn appropriate filters.
2008年4月09日 00:00:00 GMThttps://hdl.handle.net/1721.1/410932008年04月09日T00:00:00ZZigZag Decoding: Combating Hidden Terminals in Wireless Networks
https://hdl.handle.net/1721.1/41084
ZigZag Decoding: Combating Hidden Terminals in Wireless Networks
Katabi, Dina; Gollakota, Shyamnath
This paper presents ZigZag, an 802.11 receiver that combats hidden terminals. ZigZag exploits 802.11 retransmissions which, in the case of hidden terminals, cause successive collisions. Due to asynchrony, these collisions have different interference-free stretches at their start, which ZigZag uses to bootstrap its decoding. ZigZag makes no changes to the 802.11 MAC and introduces no overhead when there are no collisions. But, when senders collide, ZigZag attains the same throughput as if the colliding packets were a priori scheduled in separate time slots. We build a prototype of ZigZag in GNU Radio. In a testbed of 14 USRP nodes, ZigZag reduces the average packet loss rate at hidden terminals from 82.3% to about 0.7%.
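To convey the core decoding idea, here is a toy symbol-level model in Python: two collisions of the same pair of packets at different offsets, an ideal additive channel, and iterative subtraction of already-decoded symbols. A real receiver must also handle noise, channel estimation, and detection; all names here are illustrative.

```python
# Toy ZigZag-style decoder over an ideal additive channel. collisions[k] is
# the superposition of packet A (starting at sample 0) and packet B
# (starting at sample offsets[k]); both packets have n symbols.
import numpy as np

def zigzag_decode(collisions, offsets, n):
    A, B = np.full(n, np.nan), np.full(n, np.nan)
    changed = True
    while changed:                               # sweep until no progress
        changed = False
        for y, d in zip(collisions, offsets):
            for t in range(n + d):
                ai = t if t < n else None        # A's symbol index here
                bi = t - d if t >= d else None   # B's symbol index here
                a = A[ai] if ai is not None else 0.0
                b = B[bi] if bi is not None else 0.0
                a_unknown = ai is not None and np.isnan(a)
                b_unknown = bi is not None and np.isnan(b)
                if a_unknown and not b_unknown:
                    A[ai] = y[t] - b             # only A unknown: solve it
                    changed = True
                elif b_unknown and not a_unknown:
                    B[bi] = y[t] - a             # only B unknown: solve it
                    changed = True
    return A, B

rng = np.random.default_rng(0)
n = 8
A_true = rng.integers(0, 4, n).astype(float)
B_true = rng.integers(0, 4, n).astype(float)

def collide(d):
    y = np.zeros(n + d)
    y[:n] += A_true                              # A starts at sample 0
    y[d:d + n] += B_true                         # B starts d samples later
    return y

A, B = zigzag_decode([collide(2), collide(5)], [2, 5], n)
assert np.allclose(A, A_true) and np.allclose(B, B_true)
```

The interference-free stretch at the start of each collision seeds the process, and each newly recovered symbol exposes another one in the other collision, which is the zigzag the name refers to.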
2008年4月08日 00:00:00 GMThttps://hdl.handle.net/1721.1/410842008年04月08日T00:00:00ZLIBPMK: A Pyramid Match Toolkit
https://hdl.handle.net/1721.1/41070
LIBPMK: A Pyramid Match Toolkit
Lee, John J.
LIBPMK is a C++ implementation of Grauman and Darrell's pyramid match algorithm. This toolkit provides a flexible framework with which developers can quickly match sets of image features and run experiments. LIBPMK provides functionality for $k$-means and hierarchical clustering, dealing with data sets too large to fit in memory, building multi-resolution histograms, quickly performing pyramid matches, and training and testing support vector machines (SVMs). This report provides a tutorial on how to use the LIBPMK code, and gives the specifications of the LIBPMK API.
2008年4月07日 00:00:00 GMThttps://hdl.handle.net/1721.1/410702008年04月07日T00:00:00ZCognitive Security for Personal Devices
https://hdl.handle.net/1721.1/40810
Cognitive Security for Personal Devices
Greenstadt, Rachel; Beal, Jacob
Humans should be able to think of computers as extensions of their body, as craftsmen do with their tools. Current security models, however, are too unlike those used in human minds---for example, computers authenticate users by challenging them to repeat a secret rather than by continually observing the many subtle cues offered by their appearance and behavior. We propose three lines of research that can be combined to produce cognitive security on computers and other personal devices: imprinting and continuously deployed multi-modal biometrics, self-protection through virtualization and trusted computing, and adjustably autonomous security.
2008年3月17日 00:00:00 GMThttps://hdl.handle.net/1721.1/408102008年03月17日T00:00:00ZTrajectory Analysis and Semantic Region Modeling Using A Nonparametric Bayesian Model
https://hdl.handle.net/1721.1/40808
Trajectory Analysis and Semantic Region Modeling Using A Nonparametric Bayesian Model
Grimson, Eric; Wang, Xiaogang; Ng, Gee-Wah; Ma, Keng Teck
We propose a novel nonparametric Bayesian model, Dual Hierarchical Dirichlet Processes (Dual-HDP), for unsupervised trajectory analysis and semantic region modeling in surveillance settings. In our approach, trajectories are treated as documents and observations of an object on a trajectory are treated as words in a document. Trajectories are clustered into different activities, and abnormal trajectories are detected as samples with low likelihoods. The semantic regions related to activities in the scene, which are intersections of paths commonly taken by objects, are also modeled. Dual-HDP advances the existing Hierarchical Dirichlet Processes (HDP) language model: HDP only clusters co-occurring words from documents into topics and automatically decides the number of topics, whereas Dual-HDP co-clusters both words and documents, learning both the number of word topics and the number of document clusters from data. Under our problem settings, HDP only clusters observations of objects, while Dual-HDP clusters both observations and trajectories. Experiments are evaluated on two data sets: radar tracks collected from a maritime port and visual tracks collected from a parking lot.
2008年6月24日 00:00:00 GMThttps://hdl.handle.net/1721.1/408082008年06月24日T00:00:00ZTwo-stage Optimization Approach to Robust Model Predictive Control with a Joint Chance Constraint
https://hdl.handle.net/1721.1/40804
Two-stage Optimization Approach to Robust Model Predictive Control with a Joint Chance Constraint
Ono, Masahiro; Williams, Brian C.
When controlling dynamic systems such as mobile robots in uncertain environments, there is a trade-off between risk and reward. For example, a race car can turn a corner faster by taking a more challenging path. This paper proposes a new approach to planning a control sequence with a guaranteed risk bound. Given a stochastic dynamic model, the problem is to find a control sequence that optimizes a performance metric while satisfying chance constraints, i.e., constraints on the upper bound of the probability of failure. We propose a two-stage optimization approach, with the upper stage optimizing the risk allocation and the lower stage calculating the optimal control sequence that maximizes the reward. In general, the upper-stage problem is non-convex and hard to solve. We develop a new iterative algorithm for this stage that efficiently computes the risk allocation with a small penalty to optimality. The algorithm is implemented and tested on the autonomous underwater vehicle (AUV) depth planning problem, demonstrating a substantial improvement in computation cost and suboptimality compared to prior art.
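A hedged toy of the two-stage idea, with all dynamics and numbers invented: Boole's inequality lets a joint chance constraint be met by allocating per-step risks that sum to the total budget; each allocated risk then becomes a deterministic constraint margin for the lower stage, and one reallocation step moves risk from steps whose constraints are slack to steps where it matters.

    import numpy as np
    from scipy.stats import norm

    # Toy 1-D system x_t = x_{t-1} + u_t + w_t, w_t ~ N(0, 0.1^2), with the
    # joint chance constraint P(any x_t > 1) <= 0.05.
    T, Delta = 10, 0.05
    sigma_t = 0.1 * np.sqrt(np.arange(1, T + 1))   # std of x_t grows with time

    def margins(delta):
        # With risk delta_t allocated to step t, x_t must stay below 1 - margin_t.
        return sigma_t * norm.ppf(1 - delta)

    uniform = np.full(T, Delta / T)
    print("uniform margins:    ", margins(uniform).round(3))

    # One reallocation step (a sketch in the spirit of the iterative algorithm):
    # cut risk on steps assumed inactive and reassign the surplus to the rest.
    active = np.zeros(T, dtype=bool); active[-3:] = True    # pretend last steps bind
    delta = uniform.copy()
    delta[~active] *= 0.5
    delta[active] += (Delta - delta.sum()) / active.sum()   # keep sum(delta) = Delta
    print("reallocated margins:", margins(delta).round(3))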
2008年3月06日 00:00:00 GMThttps://hdl.handle.net/1721.1/408042008年03月06日T00:00:00ZEfficient Motion Planning Algorithm for Stochastic Dynamic Systems with Constraints on Probability of Failure
https://hdl.handle.net/1721.1/40803
Efficient Motion Planning Algorithm for Stochastic Dynamic Systems with Constraints on Probability of Failure
Ono, Masahiro; Williams, Brian C.
When controlling dynamic systems such as mobile robots in uncertain environments, there is a trade-off between risk and reward. For example, a race car can turn a corner faster by taking a more challenging path. This paper proposes a new approach to planning a control sequence with a guaranteed risk bound. Given a stochastic dynamic model, the problem is to find a control sequence that optimizes a performance metric while satisfying chance constraints, i.e., constraints on the upper bound of the probability of failure. We propose a two-stage optimization approach, with the upper stage optimizing the risk allocation and the lower stage calculating the optimal control sequence that maximizes the reward. In general, the upper-stage problem is non-convex and hard to solve. We develop a new iterative algorithm for this stage that efficiently computes the risk allocation with a small penalty to optimality. The algorithm is implemented and tested on the autonomous underwater vehicle (AUV) depth planning problem, demonstrating a substantial improvement in computation cost and suboptimality compared to prior art.
2008年3月06日 00:00:00 GMThttps://hdl.handle.net/1721.1/408032008年03月06日T00:00:00ZTransfer learning for image classification with sparse prototype representations
https://hdl.handle.net/1721.1/40797
Transfer learning for image classification with sparse prototype representations
Quattoni, Ariadna; Collins, Michael; Darrell, Trevor
To learn a new visual category from a few examples, prior knowledge from unlabeled data as well as from previous related categories may be useful. We develop a new method for transfer learning which exploits available unlabeled data and an arbitrary kernel function; we form a representation based on kernel distances to a large set of unlabeled data points. To transfer knowledge from previous related problems, we observe that a category might be learnable using only a small subset of reference prototypes. Related problems may share a significant number of relevant prototypes; we find such a reduced representation by performing a joint loss minimization over the training sets of related problems with a shared regularization penalty that minimizes the total number of prototypes involved in the approximation. This optimization problem can be formulated as a linear program that can be solved efficiently. We conduct experiments on a news-topic prediction task where the goal is to predict whether an image belongs to a particular news topic. Our results show that when only a few examples are available for training a target topic, leveraging knowledge learnt from other topics can significantly improve performance.
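A sketch of the representation step only, with an assumed RBF kernel and made-up sizes (the paper's joint linear program over related tasks is omitted): each example is described by its kernel values to a pool of unlabeled prototypes, so a sparsity-penalized classifier trained on this matrix selects few prototype columns.

    import numpy as np

    def prototype_representation(X, prototypes, gamma=0.5):
        # Kernel similarities of each row of X to each unlabeled prototype.
        d2 = ((X[:, None, :] - prototypes[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d2)               # shape (n_examples, n_prototypes)

    rng = np.random.default_rng(0)
    prototypes = rng.normal(size=(100, 5))       # large pool of unlabeled points
    X_train = rng.normal(size=(8, 5))            # few labeled examples for a new topic
    Phi = prototype_representation(X_train, prototypes)
    print(Phi.shape)                             # (8, 100)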
2008年3月03日 00:00:00 GMThttps://hdl.handle.net/1721.1/407972008年03月03日T00:00:00ZLearning Grammatical Models for Object Recognition
https://hdl.handle.net/1721.1/40288
Learning Grammatical Models for Object Recognition
Aycinena, Meg; Kaelbling, Leslie Pack; Lozano-Perez, Tomas
Many object recognition systems are limited by their inability to share common parts or structure among related object classes. This capability is desirable because it allows information about parts and relationships in one object class to be generalized to other classes for which it is relevant. With this goal in mind, we have designed a representation and recognition framework that captures structural variability and shared part structure within and among object classes. The framework uses probabilistic geometric grammars (PGGs) to represent object classes recursively in terms of their parts, thereby exploiting the hierarchical and substitutive structure inherent to many types of objects. To incorporate geometric and appearance information, we extend traditional probabilistic context-free grammars to represent distributions over the relative geometric characteristics of object parts as well as the appearance of primitive parts. We describe an efficient dynamic programming algorithm for object categorization and localization in images given a PGG model. We also develop an EM algorithm to estimate the parameters of a grammar structure from training data, and a search-based structure learning approach that finds a compact grammar to explain the image data while sharing substructure among classes. Finally, we describe a set of experiments that demonstrate empirically that the system provides a performance benefit.
2008年2月25日 00:00:00 GMThttps://hdl.handle.net/1721.1/402882008年02月25日T00:00:00ZExploiting Transport-Level Characteristics of Spam
https://hdl.handle.net/1721.1/40287
Exploiting Transport-Level Characteristics of Spam
Beverly, Robert; Sollins, Karen
In the arms race to secure electronic mail users and servers from unsolicited messages (spam), the most successful solutions employ techniques that are difficult for spammers to circumvent. This research investigates the transport-layer characteristics of email in order to provide a novel and robust defense against spam. We find that spam SMTP flows exhibit TCP behavior consistent with traffic competing for link access, large round-trip times, and resource-constrained hosts. Thus, SMTP flow characteristics provide sufficient statistical power to differentiate between spam and legitimate mail (ham). We build "SpamFlow" to learn and exploit these differences. Using machine learning feature selection, we identify the most discriminatory flow properties and achieve greater than 90% spam classification accuracy without content or reputation analysis. SpamFlow correctly identifies 78% of the false negatives generated by a popular content filtering application, demonstrating the power of combining SpamFlow with existing techniques. Finally, we argue that SpamFlow is not easily subvertible due to economic and practical constraints inherent in sourcing spam.
2008年2月15日 00:00:00 GMThttps://hdl.handle.net/1721.1/402872008年02月15日T00:00:00ZUnsupervised Distributed Feature Selection for Multi-view Object Recognition
https://hdl.handle.net/1721.1/40286
Unsupervised Distributed Feature Selection for Multi-view Object Recognition
Christoudias, C. Mario; Urtasun, Raquel; Darrell, Trevor
Object recognition accuracy can be improved when information from multiple views is integrated, but information in each view can often be highly redundant. We consider the problem of distributed object recognition or indexing from multiple cameras, where the computational power available at each camera sensor is limited and communication between sensors is prohibitively expensive. In this scenario, it is desirable to avoid sending redundant visual features from multiple views, but traditional supervised feature selection approaches are inapplicable as the class label is unknown at the camera. In this paper we propose an unsupervised multi-view feature selection algorithm based on a distributed compression approach. With our method, a Gaussian Process model of the joint view statistics is used at the receiver to obtain a joint encoding of the views without directly sharing information across encoders. We demonstrate our approach on recognition and indexing tasks with multi-view image databases and show that our method compares favorably to an independent encoding of the features from each camera.
2008年2月17日 00:00:00 GMThttps://hdl.handle.net/1721.1/402862008年02月17日T00:00:00ZMaking Medical Records More Resilient
https://hdl.handle.net/1721.1/40285
Making Medical Records More Resilient
Rudin, Robert
Hurricane Katrina showed that the current methods for handling medical records are minimally resilient to large-scale disasters. This research presents a preliminary model for measuring the resilience of medical records systems against public policy goals and uses the model to illuminate the current state of medical record resilience. From this analysis, three recommendations for how to make medical records more resilient are presented: 1) Federal and state governments should use the preliminary resilience model introduced here as the basis for compliance requirements for electronic medical record technical architectures. 2) Regional Health Information Organizations (RHIOs) should consider offering services in disaster management to healthcare organizations; this will help RHIOs create sustainable business models. 3) Storage companies should consider developing distributed storage solutions based on Distributed Hash Table (DHT) technology for medical record storage. Distributed storage would alleviate public concerns over privacy with centralized storage of medical records. Empirical evidence is presented demonstrating the performance of DHT technology using a prototype medical record system.
2008年2月17日 00:00:00 GMThttps://hdl.handle.net/1721.1/402852008年02月17日T00:00:00ZWicked Problems and Gnarly Results: Reflecting on Design and Evaluation Methods for Idiosyncratic Personal Information Management Tasks
https://hdl.handle.net/1721.1/40281
Wicked Problems and Gnarly Results: Reflecting on Design and Evaluation Methods for Idiosyncratic Personal Information Management Tasks
Bernstein, Michael; Van Kleek, Max; Khushraj, Deepali; Nayak, Rajeev; Liu, Curtis; schraefel, mc; Karger, David R.
This paper is a case study of an artifact design and evaluation process; it is a reflection on how right thinking about design methods may at times result in sub-optimal results. Our goal has been to assess our decision-making process throughout the design and evaluation stages for a software prototype in order to consider where design methodology may need to be tuned to be more sensitive to the domain of practice, in this case software evaluation in personal information management. In particular, we reflect on design methods around (1) scale of prototype, (2) prototyping and design process, (3) study design, and (4) study population.
2008年2月10日 00:00:00 GMThttps://hdl.handle.net/1721.1/402812008年02月10日T00:00:00ZFinding Bugs In Dynamic Web Applications
https://hdl.handle.net/1721.1/40249
Finding Bugs In Dynamic Web Applications
Artzi, Shay; Kiezun, Adam; Dolby, Julian; Tip, Frank; Dig, Danny; Paradkar, Amit; Ernst, Michael D.
Web script crashes and malformed dynamically-generated web pages are common errors, and they seriously impact usability of web applications. Current tools for web-page validation cannot handle the dynamically-generated pages that are ubiquitous on today's Internet. In this work, we apply a dynamic test generation technique, based on combined concrete and symbolic execution, to the domain of dynamic web applications. The technique generates tests automatically and minimizes the bug-inducing inputs to reduce duplication and to make the bug reports small and easy to understand and fix. We implemented the technique in Apollo, an automated tool that found dozens of bugs in real PHP applications. Apollo generates test inputs for the web application, monitors the application for crashes, and validates that the output conforms to the HTML specification. This paper presents Apollo's algorithms and implementation, and an experimental evaluation that revealed a total of 214 bugs in 4 open-source PHP web applications.
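A hedged sketch of one ingredient, input minimization, with a stand-in for the application (the real tool generates the failing inputs in the first place via combined concrete and symbolic execution; run_app and the seeded bug below are invented):

    def run_app(params):
        # Stand-in for a PHP application with a seeded bug.
        if params.get("page") == "admin" and "user" not in params:
            raise RuntimeError("crash: undefined $user")

    def crashes(params):
        try:
            run_app(params)
            return False
        except RuntimeError:
            return True

    def minimize(params):
        # Greedily drop keys that are not needed to reproduce the crash.
        assert crashes(params)
        for key in list(params):
            trial = {k: v for k, v in params.items() if k != key}
            if crashes(trial):
                params = trial
        return params

    bug_input = {"page": "admin", "lang": "en", "theme": "dark", "q": "x"}
    print(minimize(bug_input))   # -> {'page': 'admin'}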
2008年2月06日 00:00:00 GMThttps://hdl.handle.net/1721.1/402492008年02月06日T00:00:00ZWaveScript: A Case-Study in Applying a Distributed Stream-Processing Language
https://hdl.handle.net/1721.1/40095
WaveScript: A Case-Study in Applying a Distributed Stream-Processing Language
Newton, Ryan; Girod, Lewis; Craig, Michael; Madden, Sam; Morrisett, Greg
Applications that combine live data streams with embedded, parallel, and distributed processing are becoming more commonplace. WaveScript is a domain-specific language that brings high-level, type-safe, garbage-collected programming to these domains. This is made possible by three primary implementation techniques. First, we employ a novel evaluation strategy that uses a combination of interpretation and reification to partially evaluate programs into stream dataflow graphs. Second, we use profile-driven compilation to enable many optimizations that are normally only available in the synchronous (rather than asynchronous) dataflow domain. Finally, we incorporate an extensible system for rewrite rules to capture algebraic properties in specific domains (such as signal processing). We have used our language to build and deploy a sensor network for the acoustic localization of wild animals, in particular, the Yellow-Bellied marmot. We evaluate WaveScript's performance on this application, showing that it yields good performance on both embedded and desktop-class machines, including distributed execution and substantial parallel speedups. Our language allowed us to implement the application rapidly, while outperforming a previous C implementation by over 35%, using fewer than half the lines of code. We evaluate the contribution of our optimizations to this success.
2008年1月31日 00:00:00 GMThttps://hdl.handle.net/1721.1/400952008年01月31日T00:00:00ZCabernet: A Content Delivery Network for Moving Vehicles
https://hdl.handle.net/1721.1/40094
Cabernet: A Content Delivery Network for Moving Vehicles
Eriksson, Jakob; Balakrishnan, Hari; Madden, Sam
This paper describes the design, implementation, and evaluation of Cabernet, a system to deliver data to and from moving vehicles using open 802.11 (WiFi) access points encountered opportunistically during travel. Network connectivity in Cabernet is both fleeting (access points are typically within range for a few seconds) and intermittent (because the access points don't provide continuous coverage), and suffers from high packet loss rates over the wireless channel. On the positive side, in the absence of losses, achievable data rates over WiFi can reach many megabits per second. Unfortunately, current protocols don't establish end-to-end connectivity fast enough, don't cope well with intermittent connectivity, and don't handle high packet loss rates well enough to achieve this potential throughput. Cabernet incorporates two new techniques to improve data delivery throughput: QuickWifi, a streamlined client-side process to establish end-to-end connectivity quickly, reducing the mean time to establish connectivity from 12.9 seconds to less than 366 ms, and CTP, a transport protocol that distinguishes congestion on the wired portion of the path from losses over the wireless link to reliably and efficiently deliver data to nodes in cars. We have deployed the system on a fleet of 10 taxis, each running several hours per day in the Boston area. Our experiments show that CTP improves throughput by a factor of 2x over TCP and that QuickWifi increases the number of connections by a factor of 4x over unoptimized approaches. Thus, Cabernet is perhaps the first practical system capable of delivering data to moving vehicles over existing short-range WiFi radios, with a mean transfer capacity of approximately 38 megabytes/hour per car, or a mean rate of 87 kbit/s.
2008年1月17日 00:00:00 GMThttps://hdl.handle.net/1721.1/400942008年01月17日T00:00:00ZExact Algorithms for the Canadian Traveller Problem on Paths and Trees
https://hdl.handle.net/1721.1/40093
Exact Algorithms for the Canadian Traveller Problem on Paths and Trees
Karger, David; Nikolova, Evdokia
The Canadian Traveller problem is a stochastic shortest paths problem in which one learns the cost of an edge only when arriving at one of its endpoints. The goal is to find an adaptive policy (adjusting as one learns more edge lengths) that minimizes the expected cost of travel. The problem is known to be #P-hard. Since there has been no significant progress on approximation algorithms for several decades, we have chosen to seek out special cases for which exact solutions exist, in the hope of demonstrating techniques that could lead to further progress. Applying techniques from the theory of Markov Decision Processes, we give an exact solution for graphs of parallel (undirected) paths from source to destination with random two-valued edge costs. We also offer a partial generalization to traversing perfect binary trees.
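To make the setting concrete (this is a brute-force expectimax baseline, not the paper's exact closed-form policy, and all instance numbers are invented): each parallel path has a known cost to its midpoint, then a two-valued risky edge whose cost is learned on arrival, and the traveller may backtrack and try another path.

    import functools

    # Path i: known cost c[i] to the midpoint, then a risky edge that costs
    # lo[i] with probability p[i], else hi[i].
    c  = (2.0, 1.0, 4.0)
    lo = (1.0, 1.0, 0.5)
    hi = (9.0, 30.0, 6.0)
    p  = (0.5, 0.8, 0.9)

    @functools.lru_cache(maxsize=None)
    def V(obs):
        # Optimal expected cost-to-go from the source; obs[i] is None if path
        # i is unexplored, else the revealed cost of its risky edge.
        best = float("inf")
        for i, o in enumerate(obs):
            if o is not None:
                best = min(best, c[i] + o)                 # commit to path i
            else:                                          # explore path i
                def after(cost):
                    seen = obs[:i] + (cost,) + obs[i + 1:]
                    return min(cost, c[i] + V(seen))       # cross, or backtrack
                best = min(best, c[i] + p[i] * after(lo[i])
                                      + (1 - p[i]) * after(hi[i]))
        return best

    print("optimal expected cost:", V((None, None, None)))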
2008年1月28日 00:00:00 GMThttps://hdl.handle.net/1721.1/400932008年01月28日T00:00:00ZSimulation of Human Motion Data using Short-Horizon Model-Predictive Control
https://hdl.handle.net/1721.1/40091
Simulation of Human Motion Data using Short-Horizon Model-Predictive Control
Silva, Marco da; Abe, Yeuhi; Popovic, Jovan
Many data-driven animation techniques are capable of producing high quality motions of human characters. Few techniques, however, are capable of generating motions that are consistent with physically simulated environments. Physically simulated characters, in contrast, are automatically consistent with the environment, but their motions are often unnatural because they are difficult to control. We present a model-predictive controller that yields natural motions by guiding simulated humans toward real motion data. During simulation, the predictive component of the controller solves a quadratic program to compute the forces for a short window of time into the future. These forces are then applied by a low-gain proportional-derivative component, which makes minor adjustments until the next planning cycle. The controller is fast enough for interactive systems such as games and training simulations. It requires no precomputation and little manual tuning. The controller is resilient to mismatches between the character dynamics and the input motion, which allows it to track motion capture data even where the real dynamics are not known precisely. The same principled formulation can generate natural walks, runs, and jumps in a number of different physically simulated surroundings.
2008年1月15日 00:00:00 GMThttps://hdl.handle.net/1721.1/400912008年01月15日T00:00:00ZTheories in Practice: Easy-to-Write Specifications that Catch Bugs
https://hdl.handle.net/1721.1/40090
Theories in Practice: Easy-to-Write Specifications that Catch Bugs
Saff, David; Boshernitsan, Marat; Ernst, Michael D.
Automated testing during development helps ensure that software works according to the test suite. Traditional test suites verify a few well-picked scenarios or example inputs. However, such example-based testing does not uncover errors in legal inputs that the test writer overlooked. We propose theory-based testing as an adjunct to example-based testing. A theory generalizes a (possibly infinite) set of example-based tests. A theory is an assertion that should be true for any data, and it can be exercised by human-chosen data or by automatic data generation. A theory is expressed in an ordinary programming language, it is easy for developers to use (often even easier than example-based testing), and it serves as a lightweight form of specification. Six case studies demonstrate the utility of theories that generalize existing tests to prevent bugs, clarify intentions, and reveal design problems.
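An illustration of a theory in plain Python (the paper targets JUnit; the code under test and the random input generator here are invented): the assertion must hold for any input, and generated data exercises it far beyond hand-picked examples.

    import random

    def normalize_path(p):                 # toy code under test
        while "//" in p:
            p = p.replace("//", "/")
        return p

    def theory_idempotent(p):
        # Theory: normalizing twice equals normalizing once.
        assert normalize_path(normalize_path(p)) == normalize_path(p)

    def theory_no_double_slash(p):
        # Theory: the output never contains "//".
        assert "//" not in normalize_path(p)

    random.seed(0)
    for _ in range(1000):
        p = "".join(random.choice("ab/") for _ in range(random.randrange(20)))
        theory_idempotent(p)
        theory_no_double_slash(p)
    print("theories held on 1000 generated inputs")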
2008年1月14日 00:00:00 GMThttps://hdl.handle.net/1721.1/400902008年01月14日T00:00:00ZSparse recovery using sparse matrices
https://hdl.handle.net/1721.1/40089
Sparse recovery using sparse matrices
Berinde, Radu; Indyk, Piotr
We consider the approximate sparse recovery problem, where the goal is to (approximately) recover a high-dimensional vector x from its lower-dimensional sketch Ax. A popular way of performing this recovery is by finding x* such that Ax = Ax*, and ||x*||_1 is minimal. It is known that this approach "works" if A is a random *dense* matrix, chosen from a proper distribution. In this paper, we investigate this procedure for the case where A is binary and *very sparse*. We show that, both in theory and in practice, sparse matrices are essentially as "good" as the dense ones. At the same time, sparse binary matrices provide additional benefits, such as reduced encoding and decoding time.
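A runnable sketch of exactly this procedure under assumed toy dimensions: A is a sparse binary matrix with a few ones per column, and the L1 minimization is solved as a linear program by splitting x into positive and negative parts.

    import numpy as np
    from scipy.optimize import linprog

    rng = np.random.default_rng(0)
    n, m, d, k = 60, 30, 4, 3                # dimensions, ones per column, sparsity
    A = np.zeros((m, n))
    for j in range(n):                       # d random ones per column
        A[rng.choice(m, size=d, replace=False), j] = 1.0
    x = np.zeros(n)
    x[rng.choice(n, size=k, replace=False)] = rng.normal(size=k)   # k-sparse signal
    b = A @ x                                # the sketch

    # min 1^T (x+ + x-)  s.t.  A(x+ - x-) = b,  x+, x- >= 0
    res = linprog(c=np.ones(2 * n),
                  A_eq=np.hstack([A, -A]), b_eq=b,
                  bounds=[(0, None)] * (2 * n))
    x_hat = res.x[:n] - res.x[n:]
    print("recovery error:", np.linalg.norm(x_hat - x))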
2008年1月10日 00:00:00 GMThttps://hdl.handle.net/1721.1/400892008年01月10日T00:00:00ZRelational Envelope-based Planning
https://hdl.handle.net/1721.1/39838
Relational Envelope-based Planning
Gardiol, Natalia Hernandez
This thesis proposes a synthesis of logic and probability for solving stochastic sequential decision-making problems. We address two main questions: How can we take advantage of logical structure to speed up planning in a principled way? And, how can probability inform the production of a more robust, yet still compact, policy? We can take as inspiration a mobile robot acting in the world: it is faced with a varied amount of sensory data and uncertainty in its action outcomes. Or, consider a logistics planning system: it must deliver a large number of objects to the right place at the right time. Many interesting sequential decision-making domains involve large state spaces, large stochastic action sets, and time pressure to act. In this work, we show how structured representations of the environment's dynamics can constrain and speed up the planning process. We start with a problem domain described in a probabilistic logical description language. Our technique is based on, first, identifying the most parsimonious representation that permits solution of the described problem. Next, we take advantage of the structured problem description to dynamically partition the action space into a set of equivalence classes with respect to this minimal representation. The partitioned action space results in fewer distinct actions. This technique can yield significant gains in planning efficiency. Next, we develop an anytime technique to elaborate on this initial plan. Our approach uses the envelope MDP framework, which creates a Markov decision process out of a subset of the possible state space. This strategy lets an agent begin acting quickly within a restricted part of the full state space, as informed by the original plan, and to judiciously expand its envelope as resources permit. Finally, we show how the representation space itself can be elaborated within the anytime framework. This approach balances the need to respond to time pressure and to produce the most robust policies possible. We present experimental results in some synthetic planning domains and in a simulated military logistics domain.
2007年12月31日 00:00:00 GMThttps://hdl.handle.net/1721.1/398382007年12月31日T00:00:00ZLearning complex cell invariance from natural videos: A plausibility proof
https://hdl.handle.net/1721.1/39833
Learning complex cell invariance from natural videos: A plausibility proof
Masquelier, Timothee; Serre, Thomas; Thorpe, Simon; Poggio, Tomaso
One of the most striking features of the cortex is its ability to wire itself. Understanding how the visual cortex wires up through development and how visual experience refines connections into adulthood is a key question for neuroscience. While computational models of the visual cortex are becoming increasingly detailed, the question of how such architecture could self-organize through visual experience is often overlooked. Here we focus on the class of hierarchical feedforward models of the ventral stream of the visual cortex, which extend the classical simple-to-complex cells model by Hubel and Wiesel (1962) to extra-striate areas, and have been shown to account for a host of experimental data. Such models assume two functional classes of simple and complex cells with specific predictions about their respective wiring and resulting functionalities. In these networks, the issue of learning, especially for complex cells, is perhaps the least well understood. In fact, in most of these models, the connectivity between simple and complex cells is not learned but rather hard-wired. Several algorithms have been proposed for learning invariances at the complex cell level based on a trace rule to exploit the temporal continuity of sequences of natural images, but very few can learn from natural cluttered image sequences. Here we propose a new variant of the trace rule that only reinforces the synapses between the most active cells, and therefore can handle cluttered environments. The algorithm has so far been developed and tested at the level of V1-like simple and complex cells: we verified that Gabor-like simple cell selectivity could emerge from competitive Hebbian learning. In addition, we show how the modified trace rule allows the subsequent complex cells to learn to selectively pool over simple cells with the same preferred orientation but slightly different positions, thus increasing their tolerance to the precise position of the stimulus within their receptive fields.
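A heavily simplified numerical sketch of a trace rule of this kind (the cell counts, rates, and input statistics are all invented, and the real model operates on Gabor-filtered image sequences): the complex cell keeps a decaying trace of its activity, and only the most active simple-cell inputs are reinforced, so background clutter does not strengthen spurious synapses.

    import numpy as np

    rng = np.random.default_rng(0)
    n_simple, T = 50, 200
    w = np.abs(rng.normal(0.1, 0.02, n_simple))   # simple-to-complex weights
    trace, eta, decay, k = 0.0, 0.05, 0.8, 3

    for t in range(T):
        s = rng.random(n_simple) * 0.2            # cluttered background activity
        s[(t // 10) % 5] = 1.0                    # a feature that persists over frames
        y = w @ s                                  # complex-cell response
        trace = decay * trace + (1 - decay) * y    # temporal trace of activity
        winners = np.argsort(s)[-k:]               # only the k most active inputs
        w[winners] += eta * trace * s[winners]     # trace-gated Hebbian update
        w = np.clip(w, 0, None); w /= w.sum()      # keep weights bounded

    print("strongest afferents:", np.argsort(w)[-5:])   # the persistent features win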
2007年12月26日 00:00:00 GMThttps://hdl.handle.net/1721.1/398332007年12月26日T00:00:00ZReport on the Probabilistic Language Scheme
https://hdl.handle.net/1721.1/39831
Report on the Probabilistic Language Scheme
Radul, Alexey
Reasoning with probabilistic models is a widespread and successful technique in areas ranging from computer vision, to natural language processing, to bioinformatics. Currently, these reasoning systems are either coded from scratch in general-purpose languages or use formalisms such as Bayesian networks that have limited expressive power. In both cases, the resulting systems are difficult to modify, maintain, compose, and interoperate with. This work presents Probabilistic Scheme, an embedding of probabilistic computation into Scheme. This gives programmers an expressive language for implementing modular probabilistic models that integrate naturally with the rest of Scheme.
2007年10月22日 00:00:00 GMThttps://hdl.handle.net/1721.1/398312007年10月22日T00:00:00ZTeam MIT Urban Challenge Technical Report
https://hdl.handle.net/1721.1/39822
Team MIT Urban Challenge Technical Report
Leonard, John; Barrett, David; How, Jonathan; Teller, Seth; Antone, Matt; Campbell, Stefan; Epstein, Alex; Fiore, Gaston; Fletcher, Luke; Frazzoli, Emilio; Huang, Albert; Jones, Troy; Koch, Olivier; Kuwata, Yoshiaki; Mahelona, Keoni; Moore, David; Moyer, Katy; Olson, Edwin; Peters, Steven; Sanders, Chris; Teo, Justin; Walter, Matthew
This technical report describes Team MIT's approach to the DARPA Urban Challenge. We have developed a novel strategy for using many inexpensive sensors, mounted on the vehicle periphery, and calibrated with a new cross-modal calibration technique. Lidar, camera, and radar data streams are processed using an innovative, locally smooth state representation that provides robust perception for real-time autonomous control. A resilient planning and control architecture has been developed for driving in traffic, comprised of an innovative combination of well-proven algorithms for mission planning, situational planning, situational interpretation, and trajectory control. These innovations are being incorporated in two new robotic vehicles equipped for autonomous driving in urban environments, with extensive testing on a DARPA site visit course. Experimental results demonstrate all basic navigation and some basic traffic behaviors, including unoccupied autonomous driving, lane following using pure-pursuit control and our local frame perception strategy, obstacle avoidance using kino-dynamic RRT path planning, U-turns, and precedence evaluation amongst other cars at intersections using our situational interpreter. We are working to extend these approaches to advanced navigation and traffic scenarios.
2007年12月14日 00:00:00 GMThttps://hdl.handle.net/1721.1/398222007年12月14日T00:00:00ZQuantitative Information Flow as Network Flow Capacity
https://hdl.handle.net/1721.1/39812
Quantitative Information Flow as Network Flow Capacity
McCamant, Stephen; Ernst, Michael D.
We present a new technique for determining how much information about a program's secret inputs is revealed by its public outputs. In contrast to previous techniques based on reachability from secret inputs (tainting), it achieves a more precise quantitative result by computing a maximum flow of information between the inputs and outputs. The technique uses static control-flow regions to soundly account for implicit flows via branches and pointer operations, but operates dynamically by observing one or more program executions and giving numeric flow bounds specific to them (e.g., "17 bits"). The maximum flow in a network also gives a minimum cut (a set of edges that separate the secret input from the output), which can be used to efficiently check that the same policy is satisfied on future executions. We performed case studies on 5 real C, C++, and Objective C programs, 3 of which had more than 250K lines of code. The tool checked multiple security policies, including one that was violated by a previously unknown bug.
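The central reduction is easy to illustrate on a toy graph (the real tool builds this graph from an observed execution; the nodes and capacities below are invented): edges carry bit-width capacities, and the max flow, equal to the min cut, bounds how many secret bits can reach the output.

    import networkx as nx

    G = nx.DiGraph()
    G.add_edge("secret", "branch", capacity=32)   # 32-bit secret reaches a branch
    G.add_edge("branch", "flag", capacity=1)      # the branch condition carries 1 bit
    G.add_edge("secret", "masked", capacity=8)    # low byte copied directly
    G.add_edge("flag", "output", capacity=1)
    G.add_edge("masked", "output", capacity=8)

    bits, (src_side, sink_side) = nx.minimum_cut(G, "secret", "output")
    print(f"at most {bits} bits of the secret reach the output")   # 9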
2007年12月10日 00:00:00 GMThttps://hdl.handle.net/1721.1/398122007年12月10日T00:00:00ZVerifiably Secure Devices
https://hdl.handle.net/1721.1/39659
Verifiably Secure Devices
Lepinski, Matt; Micali, Silvio; Izmalkov, Sergei
We put forward the notion of a verifiably secure device, in essence a stronger notion of secure computation, and achieve it in the ballot-box model. Verifiably secure devices (1) provide a perfect solution to the problem of achieving correlated equilibrium, an important and extensively investigated problem at the intersection of game theory, cryptography, and efficient algorithms; and (2) enable the secure evaluation of multiple interdependent functions.
2007年12月05日 00:00:00 GMThttps://hdl.handle.net/1721.1/396592007年12月05日T00:00:00ZMapping Stream Programs into the Compressed Domain
https://hdl.handle.net/1721.1/39651
Mapping Stream Programs into the Compressed Domain
Thies, William; Hall, Steven; Amarasinghe, Saman
Due to the high data rates involved in audio, video, and signal processing applications, it is imperative to compress the data to decrease the amount of storage used. Unfortunately, this implies that any program operating on the data needs to be wrapped by a decompression and re-compression stage. Re-compression can incur significant computational overhead, while decompression swamps the application with the original volume of data. In this paper, we present a program transformation that greatly accelerates the processing of compressible data. Given a program that operates on uncompressed data, we output an equivalent program that operates directly on the compressed format. Our transformation applies to stream programs, a restricted but useful class of applications with regular communication and computation patterns. Our formulation is based on LZ77, a lossless compression algorithm that is utilized by ZIP and fully encapsulates common formats such as Apple Animation, Microsoft RLE, and Targa. We implemented a simple subset of our techniques in the StreamIt compiler, which emits executable plugins for two popular video editing tools: MEncoder and Blender. For common operations such as color adjustment and video compositing, mapping into the compressed domain offers a speedup roughly proportional to the overall compression ratio. For our benchmark suite of 12 videos in Apple Animation format, speedups range from 1.1x to 471x, with a median of 15x.
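Why the speedup tracks the compression ratio is easiest to see with run-length encoding, the simplest member of the LZ77 family the paper handles (this toy uses invented pixel data): a per-pixel operation like brightness adjustment touches each run once, so the work is proportional to the compressed size.

    def rle_encode(pixels):
        runs, prev, count = [], pixels[0], 0
        for p in pixels:
            if p == prev:
                count += 1
            else:
                runs.append((count, prev)); prev, count = p, 1
        runs.append((count, prev))
        return runs

    def brighten_rle(runs, delta):
        # One operation per run instead of one per pixel.
        return [(n, min(255, v + delta)) for n, v in runs]

    frame = [10] * 500 + [200] * 300 + [10] * 200        # 1000 pixels, 3 runs
    print(brighten_rle(rle_encode(frame), 30))           # [(500, 40), (300, 230), (200, 40)]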
2007年11月30日 00:00:00 GMThttps://hdl.handle.net/1721.1/396512007年11月30日T00:00:00ZReCrash: Making Crashes Reproducible
https://hdl.handle.net/1721.1/39639
ReCrash: Making Crashes Reproducible
Kim, Sunghun; Artzi, Shay; Ernst, Michael D.
It is difficult to fix a problem without being able to reproduce it. However, reproducing a problem is often difficult and time-consuming. This paper proposes a novel algorithm, ReCrash, that generates multiple unit tests that reproduce a given program crash. ReCrash dynamically tracks method calls during every execution of the target program. If the program crashes, ReCrash saves information about the relevant method calls and uses the saved information to create unit tests reproducing the crash. We present reCrashJ, an implementation of ReCrash for Java. reCrashJ reproduced real crashes from javac, SVNKit, Eclipse JDT, and BST. reCrashJ is efficient, incurring 13%-64% performance overhead. If this overhead is unacceptable, then reCrashJ has another mode that has negligible overhead until a crash occurs and 0%-1.7% overhead until a second crash, at which point the test cases are generated.
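A simplified Python variant of the idea (reCrashJ instruments Java bytecode and emits JUnit tests; this sketch snapshots a call's arguments only when an exception escapes, and the monitored function and its bug are invented):

    import functools

    crash_records = []

    def monitor(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            try:
                return fn(*args, **kwargs)
            except Exception:
                crash_records.append((fn, args, kwargs))   # enough to replay the call
                raise
        return wrapper

    @monitor
    def parse_price(text):
        return int(text.strip("$"))     # crashes on "" (seeded bug)

    try:
        parse_price("")
    except ValueError:
        pass

    # The generated "unit test": re-invoking the recorded call reproduces the crash.
    fn, args, kwargs = crash_records[0]
    try:
        fn(*args, **kwargs)
        print("could not reproduce")
    except ValueError:
        print("crash reproduced with args", args)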
2007年11月20日 00:00:00 GMThttps://hdl.handle.net/1721.1/396392007年11月20日T00:00:00ZTowards Feature Selection In Actor-Critic Algorithms
https://hdl.handle.net/1721.1/39427
Towards Feature Selection In Actor-Critic Algorithms
Rohanimanesh, Khashayar; Roy, Nicholas; Tedrake, Russ
Choosing features for the critic in actor-critic algorithms with function approximation is known to be a challenge. Too few critic features can lead to degeneracy of the actor gradient, and too many features may lead to slower convergence of the learner. In this paper, we show that a well-studied class of actor policies satisfy the known requirements for convergence when the actor features are selected carefully. We demonstrate that two popular representations for value methods - the barycentric interpolators and the graph Laplacian proto-value functions - can be used to represent the actor in order to satisfy these conditions. A consequence of this work is a generalization of the proto-value function methods to the continuous action actor-critic domain. Finally, we analyze the performance of this approach using a simulation of a torque-limited inverted pendulum.
2007年11月01日 00:00:00 GMThttps://hdl.handle.net/1721.1/394272007年11月01日T00:00:00ZTransfering Nonlinear Representations using Gaussian Processes with a Shared Latent Space
https://hdl.handle.net/1721.1/39426
Transfering Nonlinear Representations using Gaussian Processes with a Shared Latent Space
Urtasun, Raquel; Quattoni, Ariadna; Darrell, Trevor
When a series of problems are related, representations derived from learning earlier tasks may be useful in solving later problems. In this paper we propose a novel approach to transfer learning with low-dimensional, non-linear latent spaces. We show how such representations can be jointly learned across multiple tasks in a discriminative probabilistic regression framework. When transferred to new tasks with relatively few training examples, learning can be faster and/or more accurate. Experiments on a digit recognition task show significantly improved performance when compared to baseline performance with the original feature representation or with a representation derived from a semi-supervised learning approach.
2007年11月06日 00:00:00 GMThttps://hdl.handle.net/1721.1/394262007年11月06日T00:00:00ZCollusion-Resilient Revenue In Combinatorial Auctions
https://hdl.handle.net/1721.1/39420
Collusion-Resilient Revenue In Combinatorial Auctions
Valiant, Paul; Micali, Silvio
In auctions of a single good, the second-price mechanism achieves, in dominant strategies, a revenue benchmark that is naturally high and resilient to any possible collusion. We show how to achieve, to the maximum extent possible, the same properties in combinatorial auctions.
2007年11月02日 00:00:00 GMThttps://hdl.handle.net/1721.1/394202007年11月02日T00:00:00ZSet Interfaces for Generalized Typestate and Data Structure Consistency Verification
https://hdl.handle.net/1721.1/39419
Set Interfaces for Generalized Typestate and Data Structure Consistency Verification
Lam, Patrick; Zee, Karen; Kuncak, Viktor; Rinard, Martin
Typestate systems allow the type of an object to change during its lifetime in the computation. Unlike standard type systems, they can enforce safety properties that depend on changing object states. We present a new, generalized formulation of typestate that models the typestate of an object through membership in abstract sets. This abstract set formulation enables developers to reason about cardinalities of sets, and in particular to state and verify the condition that certain sets are empty. We support hierarchical typestate classifications by specifying subset and disjointness properties over the typestate sets. We present our formulation of typestate in the context of the Hob program specification and verification framework. The Hob framework allows the combination of typestate analysis with powerful independently developed analyses such as shape analyses or theorem proving techniques. We implemented our analysis and annotated several programs (75-2500 lines of code) with set specifications. Our implementation includes several optimizations that improve the scalability of the analysis and a novel loop invariant inference algorithm that eliminates the need to specify loop invariants. We present experimental data demonstrating the effectiveness of our techniques.
2007年10月31日 00:00:00 GMThttps://hdl.handle.net/1721.1/394192007年10月31日T00:00:00ZFast Self-Healing Gradients
https://hdl.handle.net/1721.1/39418
Fast Self-Healing Gradients
Beal, Jacob; Bachrach, Jonathan; Vickery, Dan; Tobenkin, Mark
We present CRF-Gradient, a self-healing gradient algorithm that provably reconfigures in O(diameter) time. Self-healing gradients are a frequently used building block for distributed self-healing systems, but previous algorithms either have a healing rate limited by the shortest link in the network or must rebuild invalid regions from scratch. We have verified CRF-Gradient in simulation and on a network of Mica2 motes. Our approach can also be generalized and applied to create other self-healing calculations, such as cumulative probability fields.
2008年3月01日 00:00:00 GMThttps://hdl.handle.net/1721.1/394182008年03月01日T00:00:00ZPluggable type-checking for custom type qualifiers in Java
https://hdl.handle.net/1721.1/38878
Pluggable type-checking for custom type qualifiers in Java
Papi, Matthew M.; Ali, Mahmood; Correa Jr., Telmo Luis; Perkins, Jeff H.; Ernst, Michael D.
We have created a framework for adding custom type qualifiers to the Java language in a backward-compatible way. The type system designer defines the qualifiers and creates a compiler plug-in that enforces their semantics. Programmers can write the type qualifiers in their programs and be informed of errors or assured that the program is free of those errors. The system builds on existing Java tools and APIs. In order to evaluate our framework, we have written four type-checkers using the framework: for a non-null type system that can detect and prevent null pointer errors; for an interned type system that can detect and prevent equality-checking errors; for a reference immutability type system, Javari, that can detect and prevent mutation errors; and for a reference and object immutability type system, IGJ, that can detect and prevent even more mutation errors. We have conducted case studies using each checker to find real errors in existing software. These case studies demonstrate that the checkers and the framework are practical and useful.
2007年9月17日 00:00:00 GMThttps://hdl.handle.net/1721.1/388782007年09月17日T00:00:00ZMIXIT: The Network Meets the Wireless Channel
https://hdl.handle.net/1721.1/38871
MIXIT: The Network Meets the Wireless Channel
Katti, Sachin; Katabi, Dina
The traditional contract between the network and the lower layers states that the network does routing and the lower layers deliver correct packets. In a wireless network, however, different nodes may hear most bits in a transmission, yet none of them receives the whole packet uncorrupted. The current approach imposes fate sharing on the bits, dropping a whole packet because of a few incorrect bits. In contrast, this paper proposes MIXIT, a new architecture that performs opportunistic routing on groups of correctly received symbols. We show, using simulations driven with software radio measurements, that MIXIT provides a 4x throughput improvement over state-of-the-art opportunistic routing.
2007年9月04日 00:00:00 GMThttps://hdl.handle.net/1721.1/388712007年09月04日T00:00:00ZFactors Affecting the Adoption of Faculty-Developed Academic Software: A Study of Five iCampus Projects
https://hdl.handle.net/1721.1/38487
Factors Affecting the Adoption of Faculty-Developed Academic Software: A Study of Five iCampus Projects
Ehrmann, Stephen C.; Gilbert, Steven W.; McMartin, Flora; Abelson, Harold; Long, Philip D.
Instruction in higher education must adapt more rapidly to changes in workforce needs, global issues, advances in disciplines, and resource constraints. The pace of such improvement depends on the speed with which new ideas and materials are adopted across institutions. In 1999 Microsoft pledged 25ドル million and staff support for iCampus, a seven-year MIT project to develop pioneering uses of educational technology. The TLT Group studied five iCampus projects in order to identify factors affecting institutionalization and widespread dissemination. Among the factors impeding adoption: lack of rewards and support for faculty to adopt innovations; faculty isolation; and a lack of attention to adoption issues among projects selected for funding. The study made recommendations for universities, foundations, government agencies and corporations: 1) continue making education more authentic, active, collaborative, and feedback-rich; 2) create demand to adopt ideas and materials from other sources by encouraging all faculty members to improve and document learning in their programs, year after year; 3) nurture coalitions for instructional improvement, across and within institutions; 4) create more effective higher education corporate alliances; and 5) improve institutional services to support faculty in educational design, software development, assessment methods, formative evaluation, and/or in sharing ideas with others who teach comparable courses.
2007年10月20日 00:00:00 GMThttps://hdl.handle.net/1721.1/384872007年10月20日T00:00:00ZWorld Wide Web Without Walls
https://hdl.handle.net/1721.1/38485
World Wide Web Without Walls
Brodsky, Micah Z. (Micah Zev); Krohn, Maxwell; Morris, Robert; Walfish, Michael; Yip, Alexander
Today's Web is built upon a particular symbiotic relationship between sites and users: the sites invest capital to create and market a set of features, and users gain access to the sites often in exchange for their data (e.g., photos, personal information, creative musings, etc.). This paper imagines a very different Web ecosystem, in which users retain control of their data and developers can justify their existence without hoarding user data.
2007年8月24日 00:00:00 GMThttps://hdl.handle.net/1721.1/384852007年08月24日T00:00:00ZConstraint and Restoring Force
https://hdl.handle.net/1721.1/38484
Constraint and Restoring Force
Beal, Jacob; Bachrach, Jonathan; Tobenkin, Mark
Long-lived sensor network applications must be able to self-repair and adapt to changing demands. We introduce a new approach for doing so: Constraint and Restoring Force. CRF is a physics-inspired framework for computing scalar fields across a sensor network with occasional changes. We illustrate CRF's usefulness by applying it to gradients, a common building block for sensor network systems. The resulting algorithm, CRF-Gradient, determines locally when to self-repair and when to stop and save energy. CRF-Gradient is self-stabilizing, converges in O(diameter) time, and has been verified experimentally in simulation and on a network of Mica2 motes. Finally we show how CRF can be applied to other algorithms as well, such as the calculation of probability fields.
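A loose sketch of the constraint-and-restoring-force interplay on a line graph (the update rule, graph, and rates below are illustrative guesses, not the paper's exact algorithm): source nodes are constrained to zero, other nodes stay consistent with a supporting neighbor when one exists, and otherwise a restoring force lifts their value, so stale low values heal upward instead of being rebuilt from scratch.

    def crf_gradient_step(value, neighbors, sources, edge_len=1.0, rise=1.0):
        new = {}
        for v in value:
            if v in sources:
                new[v] = 0.0                       # constraint: sources pinned to 0
            else:
                best = min((value[u] + edge_len for u in neighbors[v]),
                           default=float("inf"))
                # Follow a supporting neighbor if one constrains us; otherwise
                # the restoring force raises the value at a bounded rate.
                new[v] = best if best <= value[v] + rise else value[v] + rise
        return new

    # 5 nodes in a line, source at node 0.
    neighbors = {i: [j for j in (i - 1, i + 1) if 0 <= j < 5] for i in range(5)}
    value = {i: 0.0 for i in range(5)}
    for _ in range(6):
        value = crf_gradient_step(value, neighbors, {0})
    print(value)                        # converges to hop distance from the source

    for _ in range(6):                  # source fails: values self-heal upward
        value = crf_gradient_step(value, neighbors, set())
    print(value)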
2007年8月24日 00:00:00 GMThttps://hdl.handle.net/1721.1/384842007年08月24日T00:00:00ZLearning by Learning To Communicate
https://hdl.handle.net/1721.1/38483
Learning by Learning To Communicate
Beal, Jacob
Human intelligence is a product of cooperation among many different specialists. Much of this cooperation must be learned, but we do not yet have a mechanism that explains how this might happen for the "high-level" agile cooperation that permeates our daily lives. I propose that the various specialists learn to cooperate by learning to communicate, basing this proposal on the phenomenon of "communication bootstrapping," in which shared experiences form a basis for agreement on a system of signals. In this dissertation, I lay out a roadmap for investigating this hypothesis, identifying problems that must be overcome in order to understand the capabilities of communication bootstrapping and in order to test whether it is exploited by human intelligence. I then demonstrate progress along the course of investigation laid out in my roadmap:
* I establish a measure of "developmental cost" that allows me to eliminate many possible designs.
* I develop a method of engineering devices for use in models of intelligence, including characterizing their behavior under a wide variety of conditions and compensating for their misbehavior using "failure simplification."
* I develop mechanisms that reliably produce communication bootstrapping such that it can be used to connect specialists in an engineered system.
* I construct a demonstration system including a simulated world and a pair of observers that learn world dynamics via communication bootstrapping.
PhD thesis
2007年8月23日 00:00:00 GMThttps://hdl.handle.net/1721.1/384832007年08月23日T00:00:00ZFactors Affecting the Adoption of Faculty-Developed Academic Software: A Study of Five iCampus Projects
https://hdl.handle.net/1721.1/38482
Factors Affecting the Adoption of Faculty-Developed Academic Software: A Study of Five iCampus Projects
Ehrmann, Stephen C.; Gilbert, Steven W.; McMartin, Flora
Initiated in 1999, iCampus is a research collaboration between Microsoft Research and MIT whose goal is to create and demonstrate technologies with the potential for revolutionary change throughout the university curriculum. The program was made possible by a 25ドル million research grant from Microsoft to MIT, and involves extensive collaboration between MIT and Microsoft staff. This assessment study by the TLT Group addresses the question: in light of the experience of iCampus, especially those projects selected by MIT and Microsoft for close study, what can be learned about priorities for educational technology initiatives in the future and about how the spread of such innovations can be more effectively supported? The major conclusions are that the five projects studied improved important elements of an MIT education by making learning more authentic, active, collaborative, and feedback-rich. Nevertheless, wider adoption beyond MIT was extremely difficult to achieve, largely due to structural issues in universities that make it difficult for educational technology to spread beyond the initial innovators, even to other departments within the same institution. The report includes recommendations for universities, external sponsors, and for MIT in particular, about steps to take to achieve more effective dissemination.
2007年8月20日 00:00:00 GMThttps://hdl.handle.net/1721.1/384822007年08月20日T00:00:00ZToward Secure Services from Untrusted Developers
https://hdl.handle.net/1721.1/38453
Toward Secure Services from Untrusted Developers
Brodsky, Micah Z. (Micah Zev); Efstathopoulos, Petros; Kaashoek, Frans; Kohler, Eddie; Krohn, Maxwell; Mazieres, David; Morris, Robert; VanDeBogart, Steve; Yip, Alexander
We present a secure service prototype built from untrusted, contributed code. The service manages private data for a variety of different users, and user programs frequently require access to other users' private data. However, aside from covert timing channels, no part of the service can corrupt private data or leak it between users or outside the system without permission from the data's owners. Instead, owners may choose to reveal their data in a controlled manner. This application model is demonstrated by Muenster, a job search website that protects both the integrity and secrecy of each user's data. In spite of running untrusted code, Muenster and other services can prevent overt leaks because the untrusted modules are constrained by the operating system to follow pre-specified security policies, which are nevertheless flexible enough for programmers to do useful work. We build Muenster atop Asbestos, a recently described operating system based on a form of decentralized information flow control.
2007年8月06日 00:00:00 GMThttps://hdl.handle.net/1721.1/384532007年08月06日T00:00:00ZPerfect Implementation of Normal-Form Mechanisms
https://hdl.handle.net/1721.1/38208
Perfect Implementation of Normal-Form Mechanisms
Izmalkov, Sergei; Lepinski, Matt; Micali, Silvio
Privacy and trust affect our strategic thinking, yet they have not been precisely modeled in mechanism design. In settings of incomplete information, traditional implementations of a normal-form mechanism (by disregarding the players' privacy, or assuming trust in a mediator) may not be realistic and fail to reach the mechanism's objectives. We thus investigate implementations of a new type. We put forward the notion of a perfect implementation of a normal-form mechanism M: in essence, an extensive-form mechanism exactly preserving all strategic properties of M, without relying on a trusted party or violating the privacy of the players. We prove that ANY normal-form mechanism can be perfectly implemented via envelopes and an envelope-randomizing device (i.e., the same tools used for running fair lotteries or tallying secret votes).
2005年1月01日 00:00:00 GMThttps://hdl.handle.net/1721.1/382082005年01月01日T00:00:00ZAgent Organization and Request Propagation in the Knowledge Plane
https://hdl.handle.net/1721.1/38207
Agent Organization and Request Propagation in the Knowledge Plane
Li, Ji
In designing and building a network like the Internet, we continue to face the problems of scale and distribution. In particular, network management has become an increasingly difficult task, and network applications often need to maintain efficient connectivity graphs for various purposes. The knowledge plane was proposed as a new construct to improve network management and applications. In this proposal, I describe an application-independent mechanism to support the construction of application-specific connectivity graphs. Specifically, I propose to build a network knowledge plane and multiple sub-planes for different areas of network services. The network knowledge plane provides valuable knowledge about the Internet to the sub-planes, and each sub-plane constructs its own connectivity graph using network knowledge and knowledge in its own specific area. I focus on two key design issues: (1) a region-based architecture for agent organization; (2) knowledge dissemination and request propagation. Network management and applications benefit from the underlying network knowledge plane and sub-planes. To demonstrate the effectiveness of this mechanism, I conduct case studies in network management and security.
2007年7月26日 00:00:00 GMThttps://hdl.handle.net/1721.1/382072007年07月26日T00:00:00ZContinuous Space-Time Semantics Allow Adaptive Program Execution
https://hdl.handle.net/1721.1/38206
Continuous Space-Time Semantics Allow Adaptive Program Execution
Bachrach, Jonathan; Beal, Jacob; Fujiwara, Takeshi
A spatial computer is a collection of devices filling space whose ability to interact is strongly dependent on their proximity. Previously, we have shown that programming such a computer as a continuous space can allow self-scaling across computers with different device distributions and can increase robustness against device failure. We have extended these ideas to time, allowing self-scaling across computers with different communication and execution rates. We have used a network of 24 Mica2 Motes to demonstrate that a program exploiting these ideas shows minimal difference in behavior as the time between program steps ranges from 100 ms to 300 ms and on a configuration with mixed rates.
2007年7月01日 00:00:00 GMThttps://hdl.handle.net/1721.1/382062007年07月01日T00:00:00ZHierarchical Dirichlet Process-Based Models For Discovery of Cross-species Mammalian Gene Expression
https://hdl.handle.net/1721.1/37817
Hierarchical Dirichlet Process-Based Models For Discovery of Cross-species Mammalian Gene Expression
Gerber, Georg K.; Dowell, Robin D.; Jaakkola, Tommi S.; Gifford, David K.
An important research problem in computational biology is the identification of expression programs, sets of co-activated genes orchestrating physiological processes, and the characterization of the functional breadth of these programs. The use of mammalian expression data compendia for discovery of such programs presents several challenges, including: 1) cellular inhomogeneity within samples, 2) genetic and environmental variation across samples, and 3) uncertainty in the numbers of programs and sample populations. We developed GeneProgram, a new unsupervised computational framework that uses expression data to simultaneously organize genes into overlapping programs and tissues into groups to produce maps of inter-species expression programs, which are sorted by generality scores that exploit the automatically learned groupings. Our method addresses each of the above challenges by using a probabilistic model that: 1) allocates mRNA to different expression programs that may be shared across tissues, 2) is hierarchical, treating each tissue as a sample from a population of related tissues, and 3) uses Dirichlet Processes, a non-parametric Bayesian method that provides prior distributions over numbers of sets while penalizing model complexity. Using real gene expression data, we show that GeneProgram outperforms several popular expression analysis methods in recovering biologically interpretable gene sets. From a large compendium of mouse and human expression data, GeneProgram discovers 19 tissue groups and 100 expression programs active in mammalian tissues. Our method automatically constructs a comprehensive, body-wide map of expression programs and characterizes their functional generality. This map can be used for guiding future biological experiments, such as discovery of genes for new drug targets that exhibit minimal "cross-talk" with unintended organs, or genes that maintain general physiological responses that go awry in disease states. Further, our method is general, and can be applied readily to novel compendia of biological data.
2007年7月06日 00:00:00 GMThttps://hdl.handle.net/1721.1/378172007年07月06日T00:00:00ZUsing The Barton Libraries Dataset As An RDF benchmark
https://hdl.handle.net/1721.1/37816
Using The Barton Libraries Dataset As An RDF benchmark
Abadi, Daniel J.; Marcus, Adam; Madden, Samuel R.; Hollenbach, Kate
This report describes the Barton Libraries RDF dataset and Longwell query benchmark that we use for our recent VLDB paper on Scalable Semantic Web Data Management Using Vertical Partitioning.
2007年7月06日 00:00:00 GMThttps://hdl.handle.net/1721.1/378162007年07月06日T00:00:00ZTable 2 (Supplemental): Complete data for all 100 expression programs discovered by GeneProgram from the Novartis Gene Atlas v2
https://hdl.handle.net/1721.1/37603
Table 2 (Supplemental): Complete data for all 100 expression programs discovered by GeneProgram from the Novartis Gene Atlas v2
Gerber, Georg K.; Dowell, Robin D.; Jaakkola, Tommi S.; Gifford, David K.
Table 2 (Supplemental): Complete data for all 100 recurrent expression programs (EPs) discovered by GeneProgram. Each EP has two identifying rows, a list of meta-genes, and a list of significantly enriched GO categories. The first identifying row has three columns: (1) the EP identifier (an arbitrarily assigned number), (2) the number of meta-genes in the EP, and (3) the percentage of samples the EP occurs in. The second identifying row lists all tissues that use the EP (h_ = human tissue, m_ = mouse tissue); numbers in parentheses next to each tissue indicate the degree to which the tissue uses the EP. After the identifying rows, the set of meta-genes in the EP is listed. Each meta-gene has eight columns: (1) the human RefSeq identifier, (2) the mouse RefSeq identifier, (3) the empirical mean expression level, (4) the empirical mean occurrence percentage, (5) the human gene name, (6) the human Swiss-Prot description, (7) the mouse gene name, and (8) the mouse Swiss-Prot description. Following the meta-genes are lists of significant GO categories (the first list uses human annotations, and the second uses mouse annotations). The columns for each line in these lists are: (1) GO term, (2) enrichment p-value, (3) number of genes in the EP in the category / total genes in the EP with some GO category, (4) category description, and (5) total number of genes in the category that are also in the dataset analyzed.
2007年6月25日 00:00:00 GMThttps://hdl.handle.net/1721.1/376032007年06月25日T00:00:00ZTable 1 (Supplemental): Summary of expression programs discovered by GeneProgram from Novartis Tissue Atlas v2 data
https://hdl.handle.net/1721.1/37602
Table 1 (Supplemental): Summary of expression programs discovered by GeneProgram from Novartis Tissue Atlas v2 data
Gerber, Georg K.; Dowell, Robin D.; Jaakkola, Tommi S.; Gifford, David K.
Table 1 (Supplemental): Summary of recurrent expression programs (EPs) discovered by GeneProgram. The columns are: (1) the EP identifier (an arbitrarily assigned number), (2) the number of genes in the EP, (3) the number of tissues in the EP, (4) the species using the EP (i.e., one or more tissues from the species uses the EP, H = human, M = mouse), (5) the generality score (GS), (6) the top three tissues using the EP (numbers in parentheses = usage percentages), (7)-(9) the GO category name, GO term, and associated p-value for the most abundant significantly enriched category (i.e., the significant category with the most genes overlapping with the EP's genes).
2007年6月25日 00:00:00 GMThttps://hdl.handle.net/1721.1/376022007年06月25日T00:00:00ZStateful Anycast for DDoS Mitigation
https://hdl.handle.net/1721.1/37601
Stateful Anycast for DDoS Mitigation
Hansen, Richard E.
Distributed denial-of-service (DDoS) attacks can easily cripple victim hosts or networks, yet effective defenses remain elusive. Normal anycast can be used to force the diffusion of attack traffic over a group of several hosts to increase the difficulty of saturating resources at or near any one of the hosts. However, because a packet sent to the anycast group may be delivered to any member, anycast does not support protocols that require a group member to maintain state (such as TCP). This makes anycast impractical for most applications of interest. This document describes the design of Stateful Anycast, a conceptual anycast-like network service based on IP anycast. Stateful Anycast is designed to support stateful sessions without losing anycast's ability to defend against DDoS attacks. Stateful Anycast employs a set of anycasted proxies to direct packets to the proper stateholder. These proxies provide DDoS protection by dropping a session's packets upon group member request. Stateful Anycast is incrementally deployable and can scale to support many groups.
MEng thesis
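The proxy mechanism can be sketched in a few lines (all names hypothetical; the actual design must also handle proxy failure and membership change): each proxy deterministically pins a session to one stateholder, and drops a session's packets when a group member asks.

    import hashlib

    class AnycastProxy:
        """Sketch of a Stateful-Anycast-style proxy: pins each session to a
        stable stateholder so stateful protocols such as TCP survive
        anycast's deliver-to-any-member semantics, and supports per-session
        blocking for DDoS defense."""

        def __init__(self, stateholders):
            self.stateholders = stateholders   # the anycast group members
            self.blocked = set()               # sessions dropped on request

        def route(self, session_id, packet):
            if session_id in self.blocked:
                return None                    # shed attack traffic
            h = int(hashlib.sha256(session_id.encode()).hexdigest(), 16)
            return self.stateholders[h % len(self.stateholders)], packet

        def block(self, session_id):
            """Invoked when a group member requests that a session be dropped."""
            self.blocked.add(session_id)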
2007年6月21日 00:00:00 GMThttps://hdl.handle.net/1721.1/376012007年06月21日T00:00:00ZInformation Accountability
https://hdl.handle.net/1721.1/37600
Information Accountability
Weitzner, Daniel J.; Abelson, Harold; Berners-Lee, Tim; Feigenbaum, Joan; Hendler, James; Sussman, Gerald Jay
Ease of information flow is both the boon and the bane of large-scale, decentralized systems like the World Wide Web. For all the benefits and opportunities brought by the information revolution, with that same revolution have come the challenges of inappropriate use. Such excesses and abuses in the use of information are most commonly viewed through the lens of information security. This paper argues that debates over online privacy, copyright, and information policy questions have been overly dominated by the access restriction perspective. Our alternative is to design systems that are oriented toward information accountability and appropriate use, rather than information security and access restriction. Our goal is to extend the Web architecture to support transparency and accountability.
2007年6月13日 00:00:00 GMThttps://hdl.handle.net/1721.1/376002007年06月13日T00:00:00ZThe Psychophysiology of Risk Processing and Decision Making at a Regional Stock Exchange
https://hdl.handle.net/1721.1/37599
The Psychophysiology of Risk Processing and Decision Making at a Regional Stock Exchange
Perry, John C.
A longstanding controversy in philosophy is whether decision-making is governed by reason or emotion. I study the role of physiological responses in the decision-making process within the realm of financial markets, where both the environment and the decisions---trades---are measurable. In an experiment performed on a regional stock exchange, my collaborators and I record six different types of physiological signals---skin conductance/galvanic skin response (SCR/GSR), blood volume pulse (BVP), electrocardiogram (ECG), electroencephalogram (EEG), electromyogram (EMG), and temperature (Temp)---of monetarily motivated professionals making high-pressure decisions. From these signals I estimate underlying physiological features, such as heart rate, changes in body temperature, and amplitude of SCR, which serve as proxies for affect. Simultaneously, we record the real-time market information which the specialists process and which serves as the basis for their decisions, as well as recording their decisions and outcomes. In a sample of eight market-makers, I find statistically significant differences in mean skin conductance response and cardiovascular variables during transient market events relative to no-market-event control intervals. In addition, I find a strong relationship between trading decisions and physiological responses. Using regression, I demonstrate that heart rate variability can statistically significantly improve predictions of trading decisions, although not by much.
PhD thesis
2007年6月12日 00:00:00 GMThttps://hdl.handle.net/1721.1/375992007年06月12日T00:00:00ZAn Analysis of Posynomial MOSFET Models Using Genetic Algorithms and Visualization
https://hdl.handle.net/1721.1/37597
An Analysis of Posynomial MOSFET Models Using Genetic Algorithms and Visualization
Salameh, Lynne Rafik
Analog designers are interested in optimization tools which automate the process of circuit sizing. Geometric programming, which uses posynomial models of MOSFET parameters, represents one such tool. Genetic algorithms have been used to evolve posynomial models for geometric programs, with a reasonable mean error when modeling MOSFET parameters. By visualizing MOSFET data using two-dimensional plots, this thesis investigates the behavior of various MOSFET small- and large-signal parameters and consequently proposes a lower bound on the maximum error that a posynomial model cannot improve upon. It then investigates various error metrics which can be used to balance the mean and maximum errors generated by posynomial MOSFET models. Finally, the thesis uses empirical data to verify the existence of the lower bound, and compares the maximum error from various parameters modeled by the genetic algorithm and by monomial fitting. It concludes that posynomial MOSFET models suffer from inherent inaccuracies. Additionally, although genetic algorithms improve on the maximum model error, the improvement, in general, does not vastly surpass results obtained through monomial fitting, which is a less computationally intensive method. Genetic algorithms are hence best used when modeling partially convex MOSFET parameters, such as r0.
MEng thesis
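Monomial fitting, the baseline the thesis compares against, is ordinary least squares in log space, since a monomial c * x1^a1 * ... * xn^an is affine after taking logarithms. A sketch assuming strictly positive data (variable names hypothetical):

    import numpy as np

    def fit_monomial(X, y):
        """Fit y ~ c * prod_i x_i**a_i via least squares on
        log y = log c + (log X) @ a. X: (n_samples, n_vars) positive
        device parameters; y: positive target values."""
        A = np.hstack([np.ones((X.shape[0], 1)), np.log(X)])
        coef, *_ = np.linalg.lstsq(A, np.log(y), rcond=None)
        return np.exp(coef[0]), coef[1:]       # (c, exponents a_i)

    # Recover y = 2 * x0^1.5 * x1^-0.5 from noiseless samples.
    rng = np.random.default_rng(0)
    X = rng.uniform(0.1, 10.0, size=(200, 2))
    y = 2.0 * X[:, 0]**1.5 * X[:, 1]**-0.5
    print(fit_monomial(X, y))                  # ~ (2.0, [1.5, -0.5])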
2007年6月05日 00:00:00 GMThttps://hdl.handle.net/1721.1/375972007年06月05日T00:00:00ZCAPRI: A Common Architecture for Distributed Probabilistic Internet Fault Diagnosis
https://hdl.handle.net/1721.1/37595
CAPRI: A Common Architecture for Distributed Probabilistic Internet Fault Diagnosis
Lee, George J.
This thesis presents a new approach to root cause localization and fault diagnosis in the Internet based on a Common Architecture for Probabilistic Reasoning in the Internet (CAPRI) in which distributed, heterogeneous diagnostic agents efficiently conduct diagnostic tests and communicate observations, beliefs, and knowledge to probabilistically infer the cause of network failures. Unlike previous systems that can only diagnose a limited set of network component failures using a limited set of diagnostic tests, CAPRI provides a common, extensible architecture for distributed diagnosis that allows experts to improve the system by adding new diagnostic tests and new dependency knowledge.

To support distributed diagnosis using new tests and knowledge, CAPRI must overcome several challenges including the extensible representation and communication of diagnostic information, the description of diagnostic agent capabilities, and efficient distributed inference. Furthermore, the architecture must scale to support diagnosis of a large number of failures using many diagnostic agents. To address these challenges, this thesis presents a probabilistic approach to diagnosis based on an extensible, distributed component ontology to support the definition of new classes of components and diagnostic tests; a service description language for describing new diagnostic capabilities in terms of their inputs and outputs; and a message processing procedure for dynamically incorporating new information from other agents, selecting diagnostic actions, and inferring a diagnosis using Bayesian inference and belief propagation.

To demonstrate the ability of CAPRI to support distributed diagnosis of real-world failures, I implemented and deployed a prototype network of agents on PlanetLab for diagnosing HTTP connection failures. Approximately 10,000 user agents and 40 distributed regional and specialist agents on PlanetLab collect information from over 10,000 users and diagnose over 140,000 failures using a wide range of active and passive tests, including DNS lookup tests, connectivity probes, Rockettrace measurements, and user connection histories. I show how to improve accuracy and cost by learning new dependency knowledge and introducing new diagnostic agents. I also show that agents can manage the cost of diagnosing many similar failures by aggregating related requests and caching observations and beliefs.
PhD thesis
2007年6月04日 00:00:00 GMThttps://hdl.handle.net/1721.1/375952007年06月04日T00:00:00ZAmorphous Computing
https://hdl.handle.net/1721.1/37591
Amorphous Computing
Abelson, Harold; Beal, Jacob; Sussman, Gerald Jay
The goal of amorphous computing is to identify organizational principles and create programming technologies for obtaining intentional, pre-specified behavior from the cooperation of myriad unreliable parts that are arranged in unknown, irregular, and time-varying ways. The heightened relevance of amorphous computing today stems from the emergence of new technologies that could serve as substrates for information processing systems of immense power at unprecedentedly low cost, if only we could master the challenge of programming them. This document is a review of amorphous computing.
2007年1月01日 00:00:00 GMThttps://hdl.handle.net/1721.1/375912007年01月01日T00:00:00ZLocal Geometry of Multiattribute Tradeoff Preferences
https://hdl.handle.net/1721.1/37590
Local Geometry of Multiattribute Tradeoff Preferences
McGeachie, Michael
Existing preference reasoning systems have been successful in simple domains. Broader success requires more natural and more expressive preference representations. This thesis develops a representation of logical preferences that combines numerical tradeoff ratios between partial outcome descriptions with qualitative preference information. We argue our system is unique among preference reasoning systems; previous work has focused on qualitative or quantitative preferences, tradeoffs, exceptions and generalizations, or utility independence, but none have combined all of these expressions under a unified methodology.

We present new techniques for representing and giving meaning to quantitative tradeoff statements between different outcomes. The tradeoffs we consider can be multi-attribute tradeoffs relating more than one attribute at a time; they can refer to discrete or continuous domains, be conditional or unconditional, and be quantified or qualitative. We present related methods of representing judgments of attribute importance. We then build upon a methodology for representing arbitrary qualitative ceteris paribus preferences, or preferences "other things being equal," as presented in MD04. Tradeoff preferences in our representation are interpreted as constraints on the partial derivatives of the utility function. For example, a decision maker could state that "color is five times as important as price, availability, and time," a sentiment one might express in the context of repainting a home, and this is interpreted as indicating that utility increases in the positive color direction five times faster than utility increases in the positive price direction. We show that these representations generalize both the economic notion of marginal rates of substitution and previous representations of preferences in AI.
PhD thesis
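One natural reading of the paint example, written out in LaTeX (a paraphrase of the stated semantics, not the thesis's exact formalism), constrains the partial derivatives of the utility function u:

    \frac{\partial u}{\partial \mathrm{color}} \ge 5\,\frac{\partial u}{\partial \mathrm{price}},
    \qquad
    \frac{\partial u}{\partial \mathrm{color}} \ge 5\,\frac{\partial u}{\partial \mathrm{availability}},
    \qquad
    \frac{\partial u}{\partial \mathrm{color}} \ge 5\,\frac{\partial u}{\partial \mathrm{time}}.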
2007年2月01日 00:00:00 GMThttps://hdl.handle.net/1721.1/375902007年02月01日T00:00:00ZTIARA: Trust Management, Intrusion-tolerance, Accountability, and Reconstitution Architecture
https://hdl.handle.net/1721.1/37589
TIARA: Trust Management, Intrusion-tolerance, Accountability, and Reconstitution Architecture
Shrobe, Howard; Knight, Thomas; Hon, Andre de
The last 20 years have led to unprecedented improvements in chip density and system performance, fueled mainly by Moore's Law. During the same time, system and application software have bloated, leading to unmanageable complexity, vulnerability to attack, rigidity, and lack of robustness and accountability. These problems arise from the fact that all key elements of the computational environment, from hardware through system software and middleware to application code, regard the world as consisting of unconstrained "raw seething bits". No element of the entire stack is responsible for enforcing over-arching conventions of memory structuring or access control. Outsiders may easily penetrate the system by exploiting vulnerabilities (e.g., buffer overflows) arising from this lack of basic constraints. Attacks are not easily contained, whether they originate from the clever outsider who penetrates the defenses or from the insider who exploits existing privileges. Finally, because there are no facilities for tracing the provenance of data, even when an attack is detected, it is difficult if not impossible to tell which data are traceable to the attack and what data may still be trusted.

We have abundant computational resources allowing us to fix these critical problems using a combination of hardware, system software, and programming language technology. In this report, we describe the TIARA project, which is using these resources to design a new computer system that is less vulnerable, more tolerant of intrusions, capable of recovery from attacks, and accountable for its actions. TIARA provides these capabilities without significant impact on overall system performance. It achieves these goals through the judicious use of a modest amount of extra, but reasonably general-purpose, hardware that is dedicated to tracking the provenance of data at a very fine-grained level, to enforcing access control policies, and to constructing a coherent object-oriented model of memory. This hardware runs in parallel with the main data paths of the system and operates on a set of extra bits tagging each word with data-type, bounds, access control, and provenance information. Operations that violate the intended invariants are trapped, while normal results are tagged with information derived from the tags of the input operands.

This hardware level provides fine-grained support for a series of software layers that enable a variety of comprehensive access control policies, self-adaptive computing, and fine-grained recovery processing. The first of these software layers establishes a consistent object-oriented level of computing, while higher layers establish wrappers that may not be bypassed, access controls, and data provenance tracking. At the highest level we create the "plan level" of computing, in which code is executed in parallel with an abstract model (or executable specification) of the system that checks whether the code behaves as intended.
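A toy software rendering of the tag-and-trap idea (entirely hypothetical; the report describes dedicated hardware running beside the main data path, not a Python object model):

    from dataclasses import dataclass

    class TagTrap(Exception):
        """Raised when an operation violates the invariants encoded in tags."""

    @dataclass(frozen=True)
    class TaggedWord:
        value: int
        dtype: str              # data-type tag
        writable: bool          # access-control tag
        provenance: frozenset   # inputs this value derives from

    def add(a: TaggedWord, b: TaggedWord) -> TaggedWord:
        """Check operand tags, then tag the result with the union of the
        operands' provenance, mirroring how each operation's result tags
        are derived from its input tags."""
        if a.dtype != "int" or b.dtype != "int":
            raise TagTrap("add applied to non-integer tagged words")
        return TaggedWord(a.value + b.value, "int", True,
                          a.provenance | b.provenance)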
2007年5月30日 00:00:00 GMThttps://hdl.handle.net/1721.1/375892007年05月30日T00:00:00ZBeyond the Bits: Cooperative Packet Recovery Using Physical Layer Information
https://hdl.handle.net/1721.1/37587
Beyond the Bits: Cooperative Packet Recovery Using Physical Layer Information
Woo, Grace Rusi; Kheradpour, Pouya; Katabi, Dina
Wireless networks can suffer from high packet loss rates. This paper shows that the loss rate can be significantly reduced by exposing information readily available at the physical layer. We make the physical layer convey an estimate of its confidence that a particular bit is "0" or "1" to the higher layers. When used with cooperative design, this information dramatically improves the throughput of the wireless network. Access points that hear the same transmission combine their information to correct bits in a packet with minimal overhead. Similarly, a receiver may combine multiple erroneous transmissions to recover a correct packet. We analytically prove that our approach minimizes the errors in packet recovery. We also experimentally demonstrate its benefits using a testbed of GNU software radios. The results show that our approach can reduce loss rate by up to 10x in comparison with the current approach, and significantly outperforms prior cooperation proposals.
PhD thesis
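A sketch of the combining step, assuming the PHY exports its per-bit confidence as a log-likelihood ratio (sign = decoded bit, magnitude = confidence). Summing LLRs from independent receptions and re-slicing is the standard soft-combining rule, used here purely for illustration:

    import numpy as np

    def combine_receptions(llrs_per_ap):
        """Combine per-bit confidences from several access points that heard
        the same transmission. llrs_per_ap: (n_aps, n_bits) log-likelihood
        ratios; positive means '1' is more likely. Independent observations
        combine by addition."""
        total = llrs_per_ap.sum(axis=0)
        return (total > 0).astype(int), np.abs(total)   # bits, confidence

    # The APs disagree on the middle bit; the more confident one wins.
    print(combine_receptions(np.array([[2.0, -0.5, 3.0],
                                       [1.5,  1.2, -0.2]])))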
2007年5月29日 00:00:00 GMThttps://hdl.handle.net/1721.1/375872007年05月29日T00:00:00ZThe Creation of OpenCourseWare at MIT
https://hdl.handle.net/1721.1/37585
The Creation of OpenCourseWare at MIT
Abelson, Harold
This paper traces the genesis of the MIT OpenCourseWare project from its initial strategic precursors in 1999 and 2000, through its launch in 2001, and through its subsequent evolution. The story told here illuminates the interplay among institutional leadership, strategic planning, and university culture in launching major educational technology enterprises. It also shows how initiatives can evolve in unexpected ways, and can even surpass their initial goals. The paper concludes with an overview of challenges facing OpenCourseWare as it moves from the end of its production ramp-up toward sustainability.
2007年5月19日 00:00:00 GMThttps://hdl.handle.net/1721.1/375852007年05月19日T00:00:00ZDevelopmental Cost for Models of Intelligence
https://hdl.handle.net/1721.1/37336
Developmental Cost for Models of Intelligence
Beal, Jacob
We can evaluate models of natural intelligence, as well as their individual components, by using a model of hardware and development costs, ignoring almost all the details of biology. The basic argument is that neither the gross anatomy of the brain nor the behavior of individual cells nor the behavior of the whole poses sufficient constraint on the algorithms that might run within the brain, but that the process of engineering an intelligence under this cost model poses similar challenges to those faced by a human growing from a single cell to an adult. This will allow us to explore architectural ideas freely, yet retain confidence that when a system works, the principles allowing it to work are likely to be similar to those that allow human intelligence to work.
2007年5月15日 00:00:00 GMThttps://hdl.handle.net/1721.1/373362007年05月15日T00:00:00ZNotes on Regularized Least Squares
https://hdl.handle.net/1721.1/37318
Notes on Regularized Least Squares
Rifkin, Ryan M.; Lippert, Ross A.
This is a collection of information about regularized least squares (RLS). The facts here are not new results, but we have not seen them usefully collected together before. A key goal of this work is to demonstrate that with RLS, we get certain things for free: if we can solve a single supervised RLS problem, we can search for a good regularization parameter lambda at essentially no additional cost.

The discussion in this paper applies to dense regularized least squares, where we work with matrix factorizations of the data or kernel matrix. It is also possible to work with iterative methods such as conjugate gradient, and this is frequently the method of choice for large data sets in high dimensions with very few nonzero dimensions per point, such as text classification tasks. The results discussed here do not apply to iterative methods, which have different design tradeoffs.

We present the results in greater detail than strictly necessary, erring on the side of showing our work. We hope that this will be useful to people trying to learn more about linear algebra manipulations in the machine learning context.
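The "lambda for free" claim can be made concrete with a standard derivation (a minimal sketch, not a transcription of the notes): after one eigendecomposition K = Q diag(evals) Q^T of the kernel matrix, the RLS solution c(lambda) = (K + lambda*I)^-1 y costs only O(n^2) per additional lambda.

    import numpy as np

    def rls_path(K, y, lambdas):
        """Solve c = (K + lam*I)^-1 y for many values of lam at the cost of
        a single eigendecomposition, using
        (K + lam*I)^-1 = Q diag(1/(evals + lam)) Q^T."""
        evals, Q = np.linalg.eigh(K)      # one O(n^3) factorization
        Qty = Q.T @ y
        return {lam: Q @ (Qty / (evals + lam)) for lam in lambdas}

    # Tiny consistency check against a direct solve.
    rng = np.random.default_rng(0)
    A = rng.standard_normal((5, 5)); K = A @ A.T; y = rng.standard_normal(5)
    c = rls_path(K, y, [0.1, 1.0])
    print(np.allclose(c[1.0], np.linalg.solve(K + np.eye(5), y)))  # True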
2007年5月01日 00:00:00 GMThttps://hdl.handle.net/1721.1/373182007年05月01日T00:00:00ZTiny images
https://hdl.handle.net/1721.1/37291
Tiny images
Torralba, Antonio; Fergus, Rob; Freeman, William T.
The human visual system is remarkably tolerant to degradations in image resolution: in a scene recognition task, human performance is similar whether 32x32 color images or multi-megapixel images are used. With small images, even object recognition and segmentation are performed robustly by the visual system, despite the objects being unrecognizable in isolation. Motivated by these observations, we explore the space of 32x32 images using a database of 10^8 32x32 color images gathered from the Internet using image search engines. Each image is loosely labeled with one of the 70,399 non-abstract nouns in English, as listed in the Wordnet lexical database. Hence the image database represents a dense sampling of all object categories and scenes. With this dataset, we use nearest neighbor methods to perform object recognition across the 10^8 images.
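The recognition step itself is plain nearest neighbor over raw pixels; a brute-force sketch (dataset and labels hypothetical, and the real system must search 10^8 images rather than an in-memory array):

    import numpy as np

    def nearest_neighbor_label(query, images, labels):
        """Classify a 32x32 color image by copying the label of its nearest
        neighbor under sum-of-squared-differences over raw pixels.
        images: (n, 32, 32, 3) uint8 database; query: (32, 32, 3)."""
        db = images.reshape(len(images), -1).astype(np.float32)
        q = query.reshape(-1).astype(np.float32)
        ssd = ((db - q) ** 2).sum(axis=1)
        return labels[int(np.argmin(ssd))]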
2007年4月23日 00:00:00 GMThttps://hdl.handle.net/1721.1/372912007年04月23日T00:00:00ZPrinciples for Engineered Emergence (slides)
https://hdl.handle.net/1721.1/37152
Principles for Engineered Emergence (slides)
Beal, Jacob
It is difficult to establish engineering control over the behavior of aggregates of unreliable devices with complicated interaction patterns. I take a linguistic view of this problem, searching for mechanisms that simplify the composition and abstraction of complicated behaviors. From my work on various problems of aggregate control in cognitive architectures and spatial computing, I have noticed common themes in mechanisms that solve them. From these, I extract four principles which seem to help in engineering robust aggregate behavior---self-scaling, sparseness, gradual degradation, and failure simplification---and give examples of how they can be exploited.
2007年4月12日 00:00:00 GMThttps://hdl.handle.net/1721.1/371522007年04月12日T00:00:00ZSelf-Adaptive Systems for Information Survivability: PMOP and AWDRAT
https://hdl.handle.net/1721.1/37151
Self-Adaptive Systems for Information Survivability: PMOP and AWDRAT
Shrobe, Howard; Laddaga, Robert; Balzer, Robert; Goldman, Neil; Wile, Dave; Tallis, Marcelo; Hollebeek, Tim; Egyed, Alexander
Information systems form the backbones of the critical infrastructures of modern societies. Unfortunately, these systems are highly vulnerable to attacks that can result in enormous damage. Furthermore, traditional approaches to information security have not provided all the protections necessary to defeat and recover from a concerted attack; in particular, they are largely irrelevant to the problem of defending against attacks launched by insiders. This paper describes two related systems, PMOP and AWDRAT, that were developed during the DARPA Self Regenerative Systems program. PMOP defends against insider attacks, while AWDRAT is intended to detect compromises to software systems. Both rely on self-monitoring, diagnosis, and self-adaptation. We describe both systems and show the results of experiments with each.
2007年4月10日 00:00:00 GMThttps://hdl.handle.net/1721.1/371512007年04月10日T00:00:00ZA Few Days of A Robot's Life in the Human's World: Toward Incremental Individual Recognition
https://hdl.handle.net/1721.1/37144
A Few Days of A Robot's Life in the Human's World: Toward Incremental Individual Recognition
Aryananda, Lijin
This thesis presents an integrated framework and implementation for Mertz, an expressive robotic creature for exploring the task of face recognition through natural interaction in an incremental and unsupervised fashion. The goal of this thesis is to advance toward a framework which would allow robots to incrementally "get to know" a set of familiar individuals in a natural and extendable way. This thesis is motivated by the increasingly popular goal of integrating robots in the home. In order to be effective in human-centric tasks, the robots must be able to not only recognize each family member, but also to learn about the roles of various people in the household.

In this thesis, we focus on two particular limitations of the current technology. Firstly, most face recognition research concentrates on the supervised classification problem. Currently, one of the biggest problems in face recognition is how to generalize the system to be able to recognize new test data that vary from the training data. Thus, until this problem is solved completely, the existing supervised approaches may require multiple manual introduction and labelling sessions to include training data with enough variations. Secondly, there is typically a large gap between research prototypes and commercial products, largely due to lack of robustness and scalability to different environmental settings.

In this thesis, we propose an unsupervised approach which would allow for a more adaptive system that can incrementally update the training set with more recent data or new individuals over time. Moreover, it gives the robots a more natural social recognition mechanism: learning not only to recognize each person's appearance, but also to remember some relevant contextual information that the robot observed during previous interaction sessions. Therefore, this thesis focuses on integrating an unsupervised and incremental face recognition system within a physical robot which interfaces directly with humans through natural social interaction. The robot autonomously detects, tracks, and segments face images during these interactions and automatically generates a training set for its face recognition system. Moreover, in order to motivate robust solutions and address scalability issues, we chose to put the robot, Mertz, in unstructured public environments to interact with naive passersby, instead of with only the researchers within the laboratory environment.

While an unsupervised and incremental face recognition system is a crucial element toward our target goal, it is only a part of the story. A face recognition system typically receives either pre-recorded face images or a streaming video from a static camera. As illustrated by an ACLU review of a commercial face recognition installation, a security application which interfaces with the latter is already very challenging. In this case, our target goal is a robot that can recognize people in a home setting. The interface between robots and humans is even more dynamic: both the robots and the humans move around.

We present the robot implementation and its unsupervised incremental face recognition framework. We describe an algorithm for clustering local features extracted from a large set of automatically generated face data. We demonstrate the robot's capabilities and limitations in a series of experiments in a public lobby. In a final experiment, the robot interacted with a few hundred individuals over an eight-day period and generated a training set of over a hundred thousand face images. We evaluate the clustering algorithm's performance across a range of parameters on this automatically generated training data, and also on the Honda-UCSD video face database. Lastly, we present some recognition results using the self-labelled clusters.
PhD thesis
2007年4月03日 00:00:00 GMThttps://hdl.handle.net/1721.1/371442007年04月03日T00:00:00ZDiscriminative Gaussian Process Latent Variable Model for Classification
https://hdl.handle.net/1721.1/36901
Discriminative Gaussian Process Latent Variable Model for Classification
Urtasun, Raquel; Darrell, Trevor
Supervised learning is difficult with high dimensional input spaces and very small training sets, but accurate classification may be possible if the data lie on a low-dimensional manifold. Gaussian Process Latent Variable Models can discover low dimensional manifolds given only a small number of examples, but learn a latent space without regard for class labels. Existing methods for discriminative manifold learning (e.g., LDA, GDA) do constrain the class distribution in the latent space, but are generally deterministic and may not generalize well with limited training data. We introduce a method for Gaussian Process Classification using latent variable models trained with discriminative priors over the latent space, which can learn a discriminative latent space from a small training set.
2007年3月28日 00:00:00 GMThttps://hdl.handle.net/1721.1/369012007年03月28日T00:00:00ZCombined Static and Dynamic Mutability Analysis
https://hdl.handle.net/1721.1/36880
Combined Static and Dynamic Mutability Analysis
Artzi, Shay; Kiezun, Adam; Glasser, David; Ernst, Michael D.
Knowing which method parameters may be mutated during a method's execution is useful for many software engineering tasks. We present an approach to discovering parameter immutability, in which several lightweight, scalable analyses are combined in stages, with each stage refining the overall result. The resulting analysis is scalable and combines the strengths of its component analyses. As one of the component analyses, we present a novel, dynamic mutability analysis and show how its results can be improved by random input generation. Experimental results on programs of up to 185 kLOC show that, compared to previous approaches, our approach increases both scalability and overall accuracy.
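The staging idea, cheap analyses first and each stage refining only what remains unresolved, can be sketched generically (the stage names and three-way verdict are hypothetical simplifications):

    from enum import Enum

    class Mut(Enum):
        MUTABLE, IMMUTABLE, UNKNOWN = range(3)

    def staged_analysis(params, stages):
        """Run component analyses in sequence; each stage may only refine
        parameters still classified UNKNOWN, so earlier verdicts are never
        overturned. stages: callables mapping a parameter to a Mut."""
        verdict = {p: Mut.UNKNOWN for p in params}
        for stage in stages:               # e.g. [cheap_static, dynamic_run]
            for p, v in verdict.items():
                if v is Mut.UNKNOWN:
                    verdict[p] = stage(p)
        return verdict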
2007年3月23日 00:00:00 GMThttps://hdl.handle.net/1721.1/368802007年03月23日T00:00:00ZPhonetic Classification Using Hierarchical, Feed-forward, Spectro-temporal Patch-based Architectures
https://hdl.handle.net/1721.1/36865
Phonetic Classification Using Hierarchical, Feed-forward, Spectro-temporal Patch-based Architectures
Rifkin, Ryan; Bouvrie, Jake; Schutte, Ken; Chikkerur, Sharat; Kouh, Minjoon; Ezzat, Tony; Poggio, Tomaso
A preliminary set of experiments is described in which a biologically-inspired computer vision system (Serre, Wolf et al. 2005; Serre 2006; Serre, Oliva et al. 2006; Serre, Wolf et al. 2006) designed for visual object recognition was applied to the task of phonetic classification. During learning, the system processed 2-D wideband magnitude spectrograms directly as images, producing a set of 2-D spectro-temporal patch dictionaries at different spectro-temporal positions, orientations, scales, and of varying complexity. During testing, features were computed by comparing the stored patches with patches from novel spectrograms. Classification was performed using a regularized least squares classifier (Rifkin, Yeo et al. 2003; Rifkin, Schutte et al. 2007) trained on the features computed by the system. On a 20-class TIMIT vowel classification task, the model features achieved a best result of 58.74% error, compared to 48.57% error using state-of-the-art MFCC-based features trained using the same classifier. This suggests that hierarchical, feed-forward, spectro-temporal patch-based architectures may be useful for phonetic analysis.
2007年3月21日 00:00:00 GMThttps://hdl.handle.net/1721.1/368652007年03月21日T00:00:00ZObject and Reference Immutability using Java Generics
https://hdl.handle.net/1721.1/36850
Object and Reference Immutability using Java Generics
Zibin, Yoav; Potanin, Alex; Artzi, Shay; Kiezun, Adam; Ernst, Michael D.
A compiler-checked immutability guarantee provides useful documentation, facilitates reasoning, and enables optimizations. This paper presents Immutability Generic Java (IGJ), a novel language extension that expresses immutability without changing Java's syntax by building upon Java's generics and annotation mechanisms. In IGJ, each class has one additional generic parameter that is Immutable, Mutable, or ReadOnly. IGJ guarantees both reference immutability (only mutable references can mutate an object) and object immutability (an immutable reference points to an immutable object). IGJ is the first proposal for enforcing object immutability, and its reference immutability is more expressive than previous work. IGJ also permits covariant changes of generic arguments in a type-safe manner, e.g., a readonly list of integers is a subtype of a readonly list of numbers. IGJ extends Java's type system with a few simple rules. We formalize this type system and prove it sound. Our IGJ compiler works by type-erasure and generates bytecode that can be executed on any JVM without runtime penalty.
2007年3月16日 00:00:00 GMThttps://hdl.handle.net/1721.1/368502007年03月16日T00:00:00ZBuilding Spatial Computers
https://hdl.handle.net/1721.1/36840
Building Spatial Computers
Bachrach, Jonathan; Beal, Jacob
Programmability is a major challenge in spatial computing, an aggregate control problem found in domains such as sensor networks, swarm robotics, and modular robotics. We address this challenge with a model of a spatial computer (the Proto Abstract Machine) and a distributed operating system, ProtoKernel, which implements PAM approximately. ProtoKernel has been demonstrated on platforms in three spatial computing domains: sensor networks, swarm robotics, and modular robotics.
2007年3月14日 00:00:00 GMThttps://hdl.handle.net/1721.1/368402007年03月14日T00:00:00ZA Theory of Object Recognition: Computations and Circuits in the Feedforward Path of the Ventral Stream in Primate Visual Cortex
https://hdl.handle.net/1721.1/36407
A Theory of Object Recognition: Computations and Circuits in the Feedforward Path of the Ventral Stream in Primate Visual Cortex
Serre, T.; Kouh, M.; Cadieu, C.; Knoblich, U.; Kreiman, G.; Poggio, Tomaso A
We describe a quantitative theory to account for the computations performed by the feedforward path of the ventral stream of visual cortex and the local circuits implementing them. We show that a model instantiating the theory is capable of performing recognition on datasets of complex images at the level of human observers in rapid categorization tasks. We also show that the theory is consistent with (and in some cases has predicted) several properties of neurons in V1, V4, IT and PFC. The theory seems sufficiently comprehensive, detailed and satisfactory to represent an interesting challenge for physiologists and modelers: either disprove its basic features or propose alternative theories of equivalent scope. The theory suggests a number of open questions for visual physiology and psychophysics.
2005年12月19日 00:00:00 GMThttps://hdl.handle.net/1721.1/364072005年12月19日T00:00:00ZDistributed Method Selection and Dispatching of Contingent, Temporally Flexible Plans
https://hdl.handle.net/1721.1/36372
Distributed Method Selection and Dispatching of Contingent, Temporally Flexible Plans
Block, Stephen
Many applications of autonomous agents require groups to work in tight coordination. To be dependable, these groups must plan, carry out and adapt their activities in a way that is robust to failure and to uncertainty. Previous work developed contingent, temporally flexible plans. These plans provide robustness to uncertain activity durations, through flexible timing constraints, and robustness to plan failure, through alternate approaches to achieving a task. Robust execution of contingent, temporally flexible plans consists of two phases. First, in the plan extraction phase, the executive chooses between the functionally redundant methods in the plan to select an execution sequence that satisfies the temporal bounds in the plan. Second, in the plan execution phase, the executive dispatches the plan, using the temporal flexibility to schedule activities dynamically.

Previous contingent plan execution systems use a centralized architecture in which a single agent conducts planning for the entire group. This can result in a communication bottleneck at the time when plan activities are passed to the other agents for execution, and state information is returned. Likewise, a computation bottleneck may also occur because a single agent conducts all processing.

This thesis introduces a robust, distributed executive for temporally flexible plans, called Distributed-Kirk, or D-Kirk. To execute a plan, D-Kirk first distributes the plan between the participating agents, by creating a hierarchical ad-hoc network and by mapping the plan onto this hierarchy. Second, the plan is reformulated using a distributed, parallel algorithm into a form amenable to fast dispatching. Finally, the plan is dispatched in a distributed fashion.

We then extend the D-Kirk distributed executive to handle contingent plans. Contingent plans are encoded as Temporal Plan Networks (TPNs), which use a non-deterministic choice operator to compose temporally flexible plan fragments into a nested hierarchy of contingencies. A temporally consistent plan is extracted from the TPN using a distributed, parallel algorithm that exploits the structure of the TPN.

At all stages of D-Kirk, the communication load is spread over all agents, thus eliminating the communication bottleneck. In particular, D-Kirk reduces the peak communication complexity of the plan execution phase by a factor of O(A/e'), where e' is the number of edges per node in the dispatchable plan, determined by the branching factor of the input plan, and A is the number of agents involved in executing the plan.

In addition, the distributed algorithms employed by D-Kirk reduce the computational load on each agent and provide opportunities for parallel processing, thus increasing efficiency. In particular, D-Kirk reduces the average computational complexity of plan dispatching from O(eN^3) in the centralized case, to typical values of O(eN^2) per node and O(eN^3/A) per agent in the distributed case, where N is the number of nodes in the plan and e is the number of edges per node in the input plan.

Both of the above results were confirmed empirically using a C++ implementation of D-Kirk on a set of parameterized input plans. The D-Kirk implementation was also tested in a realistic application where it was used to control a pair of robotic manipulators involved in a cooperative assembly task.
SM thesis
2007年3月05日 00:00:00 GMThttps://hdl.handle.net/1721.1/363722007年03月05日T00:00:00ZSensitive Manipulation
https://hdl.handle.net/1721.1/36371
Sensitive Manipulation
Torres-Jara, Eduardo
This thesis presents an effective alternative to the traditional approach to robotic manipulation. In our approach, manipulation is mainly guided by tactile feedback as opposed to vision. The motivation comes from the fact that manipulating an object implies coming in contact with it; consequently, directly sensing physical contact seems more important than vision for controlling the interaction of the object and the robot. In this work, the traditional approach of a highly precise arm and vision system controlled by a model-based architecture is replaced by one that uses a low mechanical impedance arm with dense tactile sensing and exploration capabilities run by a behavior-based architecture.

The robot OBRERO has been built to implement this approach. New tactile sensing technology has been developed and mounted on the robot's hand. These sensors are biologically inspired and present more adequate features for manipulation than those of state-of-the-art tactile sensors. The robot's limb was built with compliant actuators, which present low mechanical impedance, to make the interaction between the robot and the environment safer than that of a traditional high-stiffness arm. A new actuator was created to fit the hand's size constraints. The reduced precision of OBRERO's limb is compensated by the capability of exploration given by the tactile sensors, actuators, and the software architecture.

The success of this approach is shown by picking up objects in an unmodelled environment. This task, simple for humans, has been a challenge for robots. The robot can deal with new, unmodelled objects. OBRERO can come gently into contact with, explore, lift, and place an object in a different location. It can also detect slippage and external forces acting on an object while it is held. Each of these steps is done using tactile feedback. This task can be done with very light objects with no fixtures and on slippery surfaces.
PhD thesis
2007年3月02日 00:00:00 GMThttps://hdl.handle.net/1721.1/363712007年03月02日T00:00:00ZTrading Structure for Randomness in Wireless Opportunistic Routing
https://hdl.handle.net/1721.1/36345
Trading Structure for Randomness in Wireless Opportunistic Routing
Chachulski, Szymon; Jennings, Michael; Katti, Sachin; Katabi, Dina
Opportunistic routing is a recent technique that achieves high throughput in the face of lossy wireless links. The current opportunistic routing protocol, ExOR, ties the MAC with routing, imposing a strict schedule on routers' access to the medium. Although the scheduler delivers opportunistic gains, it misses some of the inherent features of the 802.11 MAC. For example, it prevents spatial reuse and thus may underutilize the wireless medium. It also eliminates the layering abstraction, making the protocol less amenable to extensions for alternate traffic types such as multicast.

This paper presents MORE, a MAC-independent opportunistic routing protocol. MORE randomly mixes packets before forwarding them. This randomness ensures that routers that hear the same transmission do not forward the same packets. Thus, MORE needs no special scheduler to coordinate routers and can run directly on top of 802.11. Experimental results from a 20-node wireless testbed show that MORE's average unicast throughput is 20% higher than ExOR's, and the gains rise to 50% over ExOR when there is a chance of spatial reuse. For multicast, MORE's gains increase with the number of destinations, and are 35-200% greater than ExOR's.
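MORE's random mixing is random linear network coding; over GF(2), a coded packet is simply the XOR of a randomly chosen subset of the batch. A sketch (the field choice and framing are illustrative assumptions; the paper's details differ):

    import numpy as np

    def random_mix(packets, rng=None):
        """Produce one coded packet as a random GF(2) combination of the
        batch. Routers forwarding independently mixed packets are unlikely
        to duplicate each other's information, which is what lets the
        protocol drop ExOR-style global scheduling."""
        rng = rng or np.random.default_rng()
        coeffs = rng.integers(0, 2, size=len(packets))  # one bit per packet
        coded = np.zeros_like(packets[0])
        for c, p in zip(coeffs, packets):
            if c:
                coded ^= p                              # GF(2) addition
        return coeffs, coded    # coefficients travel in the packet header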
2007年2月23日 00:00:00 GMThttps://hdl.handle.net/1721.1/363452007年02月23日T00:00:00ZInformation Slicing: Anonymity Using Unreliable Overlays
https://hdl.handle.net/1721.1/36344
Information Slicing: Anonymity Using Unreliable Overlays
Katti, Sachin; Cohen, Jeffrey; Katabi, Dina
This paper proposes a new approach to anonymous communication called information slicing. Typically, anonymizers use onion routing, where a message is encrypted in layers with the public keys of the nodes along the path. Instead, our approach scrambles the message, divides it into pieces, and sends the pieces along disjoint paths. We show that information slicing addresses message confidentiality as well as source and destination anonymity. Surprisingly, it does not need any public key cryptography. Further, our approach naturally addresses the problem of node failures. These characteristics make it a good fit for use over dynamic peer-to-peer overlays. We evaluate the anonymity of information slicing via analysis and simulations. Our prototype implementation on PlanetLab shows that it achieves higher throughput than onion routing and effectively copes with node churn.
2007年2月23日 00:00:00 GMThttps://hdl.handle.net/1721.1/363442007年02月23日T00:00:00ZEmbracing Wireless Interference: Analog Network Coding
https://hdl.handle.net/1721.1/36343
Embracing Wireless Interference: Analog Network Coding
Katti, Sachin; Gollakota, Shyamnath; Katabi, Dina
Traditionally, interference is considered harmful. Wireless networks strive to avoid scheduling multiple transmissions at the same time in order to prevent interference. This paper adopts the opposite approach; it encourages strategically picked senders to interfere. Instead of forwarding packets, routers forward the interfering signals. The destination leverages network-level information to cancel the interference and recover the signal destined to it. The result is analog network coding because it codes signals, not bits. So, what if wireless routers forward signals instead of packets? Theoretically, we prove that such an approach doubles the capacity of the canonical relay network. Surprisingly, it is also practical. We implement our design using software radios and show that it achieves significantly higher throughput than both traditional wireless routing and prior work on wireless network coding.
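The cancellation step, in an idealized noiseless model with unit channel gains (a sketch only; a real system must also estimate channels and align symbols): the relay forwards the summed waveform, and each endpoint subtracts what it itself transmitted.

    import numpy as np

    # Alice and Bob transmit simultaneously; the relay hears the sum.
    rng = np.random.default_rng(0)
    alice, bob = rng.standard_normal(8), rng.standard_normal(8)
    relay_heard = alice + bob          # interfering signals, never decoded

    # The relay amplifies-and-forwards the mixture. Bob knows his own
    # signal, so he cancels it to recover Alice's: 2 time slots, not 4.
    recovered_by_bob = relay_heard - bob
    print(np.allclose(recovered_by_bob, alice))   # True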
2007年2月23日 00:00:00 GMThttps://hdl.handle.net/1721.1/363432007年02月23日T00:00:00ZUsing Task-Structured Probabilistic I/O Automata to Analyze an Oblivious Transfer Protocol
https://hdl.handle.net/1721.1/35918
Using Task-Structured Probabilistic I/O Automata to Analyze an Oblivious Transfer Protocol
Canetti, Ran; Cheung, Ling; Kaynar, Dilsun; Liskov, Moses; Lynch, Nancy; Pereira, Olivier; Segala, Roberto
The Probabilistic I/O Automata framework of Lynch, Segala and Vaandrager provides tools for precisely specifying protocols and reasoning about their correctness using multiple levels of abstraction, based on implementation relationships between these levels. We enhance this framework to allow analyzing protocols that use cryptographic primitives. This requires resolving and reconciling issues such as nondeterministic behavior and scheduling, randomness, resource-bounded computation, and computational hardness assumptions. The enhanced framework allows for more rigorous and systematic analysis of cryptographic protocols. To demonstrate the use of this framework, we present an example analysis that we have done for an Oblivious Transfer protocol.
2007年2月16日 00:00:00 GMThttps://hdl.handle.net/1721.1/359182007年02月16日T00:00:00ZAutomatic shaping and decomposition of reward functions
https://hdl.handle.net/1721.1/35890
Automatic shaping and decomposition of reward functions
Marthi, Bhaskara
This paper investigates the problem of automatically learning how to restructure the reward function of a Markov decision process so as to speed up reinforcement learning. We begin by describing a method that learns a shaped reward function given a set of state and temporal abstractions. Next, we consider decomposition of the per-timestep reward in multieffector problems, in which the overall agent can be decomposed into multiple units that are concurrently carrying out various tasks. We show by example that to find a good reward decomposition, it is often necessary to first shape the rewards appropriately. We then give a function approximation algorithm for solving both problems together. Standard reinforcement learning algorithms can be augmented with our methods, and we show experimentally that in each case, significantly faster learning results.
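For reference, the standard potential-based form of a shaped reward, which Ng, Harada, and Russell showed preserves optimal policies (the paper's contribution is learning such functions from state and temporal abstractions, not this formula itself):

    R'(s, a, s') \;=\; R(s, a, s') + \gamma\,\Phi(s') - \Phi(s)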
2007年2月13日 00:00:00 GMThttps://hdl.handle.net/1721.1/358902007年02月13日T00:00:00ZPPR: Partial Packet Recovery for Wireless Networks
https://hdl.handle.net/1721.1/35889
PPR: Partial Packet Recovery for Wireless Networks
Jamieson, Kyle; Balakrishnan, Hari
Bit errors occur over wireless channels when the signal isn't strong enough to overcome the effects of interference and noise. Current wireless protocols may use forward error correction (FEC) to correct for some (small) number of bit errors, but generally retransmit the whole packet if the FEC is insufficient. We observe that current wireless mesh network protocols retransmit a number of packets and that most of these retransmissions end up sending bits that have already been received multiple times, wasting network capacity. To overcome this inefficiency, we develop, implement, and evaluate a partial packet recovery (PPR) system.

PPR incorporates three new ideas: (1) SoftPHY, an expanded physical layer (PHY) interface that provides hints to the higher layers about how "close" the actual received symbol was to the one decoded; (2) a postamble scheme to recover data even when a packet's preamble is corrupted and not decodable at the receiver; and (3) PP-ARQ, an asynchronous link-layer retransmission protocol that allows a receiver to compactly encode and request retransmission of only those portions of a packet that are likely in error.

Our experimental results from a 27-node 802.15.4 testbed that includes Telos motes with 2.4 GHz Chipcon radios and GNU Radio nodes implementing the Zigbee standard (802.15.4) show that PPR increases the frame delivery rate by a factor of 2x under moderate load, and 7x under heavy load when many links have marginal quality.
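A sketch of the PP-ARQ selection step, assuming SoftPHY hints arrive as one confidence score per bit (the chunking scheme and threshold are hypothetical):

    import numpy as np

    def chunks_to_retransmit(confidence, chunk_bits=64, threshold=1.0):
        """Given per-bit SoftPHY confidence for a received packet, return
        indices of fixed-size chunks whose weakest bit falls below the
        threshold; only these ranges are requested again, instead of the
        whole packet."""
        n = len(confidence) // chunk_bits * chunk_bits
        worst = confidence[:n].reshape(-1, chunk_bits).min(axis=1)
        return np.flatnonzero(worst < threshold).tolist()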
2007年2月02日 00:00:00 GMThttps://hdl.handle.net/1721.1/358892007年02月02日T00:00:00ZHQ Replication: Properties and Optimizations
https://hdl.handle.net/1721.1/35888
HQ Replication: Properties and Optimizations
Cowling, James; Myers, Daniel; Liskov, Barbara; Rodrigues, Rodrigo; Shrira, Liuba
There are currently two approaches to providing Byzantine-fault-tolerant state machine replication: a replica-based approach, e.g., BFT, that uses communication between replicas to agree on a proposed ordering of requests, and a quorum-based approach, such as Q/U, in which clients contact replicas directly to optimistically execute operations. Both approaches have shortcomings: the quadratic cost of inter-replica communication is unnecessary when there is no contention, and Q/U requires a large number of replicas and performs poorly under contention.

We present HQ, a hybrid Byzantine-fault-tolerant state machine replication protocol that overcomes these problems. HQ employs a lightweight quorum-based protocol when there is no contention, but uses BFT to resolve contention when it arises. Furthermore, HQ uses only 3f+1 replicas to tolerate f faults, providing optimal resilience to node failures.

We implemented a prototype of HQ, and we compare its performance to BFT and Q/U analytically and experimentally. Additionally, in this work we use a new implementation of BFT designed to scale as the number of faults increases. Our results show that both HQ and our new implementation of BFT scale as f increases; additionally, our hybrid approach of using BFT to handle contention works well.
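The 3f+1 figure rests on standard quorum-intersection arithmetic (a textbook argument, not quoted from the paper): two quorums of size 2f+1 drawn from 3f+1 replicas overlap in at least

    (2f+1) + (2f+1) - (3f+1) = f+1

replicas, so even if f of them are Byzantine, any two quorums share at least one correct replica.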
2007年2月12日 00:00:00 GMThttps://hdl.handle.net/1721.1/358882007年02月12日T00:00:00ZPhonetic Classification Using Hierarchical, Feed-forward, Spectro-temporal Patch-based Architectures
https://hdl.handle.net/1721.1/35835
Phonetic Classification Using Hierarchical, Feed-forward, Spectro-temporal Patch-based Architectures
Rifkin, Ryan; Bouvrie, Jake; Schutte, Ken; Chikkerur, Sharat; Kouh, Minjoon; Ezzat, Tony; Poggio, Tomaso
A preliminary set of experiments is described in which a biologically-inspired computer vision system (Serre, Wolf et al. 2005; Serre 2006; Serre, Oliva et al. 2006; Serre, Wolf et al. 2006) designed for visual object recognition was applied to the task of phonetic classification. During learning, the system processed 2-D wideband magnitude spectrograms directly as images, producing a set of 2-D spectro-temporal patch dictionaries at different spectro-temporal positions, orientations, scales, and of varying complexity. During testing, features were computed by comparing the stored patches with patches from novel spectrograms. Classification was performed using a regularized least squares classifier (Rifkin, Yeo et al. 2003; Rifkin, Schutte et al. 2007) trained on the features computed by the system. On a 20-class TIMIT vowel classification task, the model features achieved a best result of 58.74% error, compared to 48.57% error using state-of-the-art MFCC-based features trained using the same classifier. This suggests that hierarchical, feed-forward, spectro-temporal patch-based architectures may be useful for phonetic analysis.
2007年2月01日 00:00:00 GMThttps://hdl.handle.net/1721.1/358352007年02月01日T00:00:00ZExplorations in Low-Cost Compliant Robotics
https://hdl.handle.net/1721.1/35821
Explorations in Low-Cost Compliant Robotics
Kumpf, Adam
This thesis presents the findings of exploratory research in low-cost compliant robotics. The most heavily leveraged trade-off is that of mechanical precision for computational power, with the hope that the price of future computation will continue to fall exponentially while the expected price of precision mechanical parts will remain relatively constant. The most novel contribution of this research is the Torsionally Compliant Elastomer Joint (TCEJ) which allows for compliance and sensing in a very small package while using extremely inexpensive components. Computational modeling of hysteresis, signal compression, and backlash are also explored to compensate for the non-idealities often found in cheap mechanical parts. Three proof-of-concept systems are described along with a set of experiments used to test their capabilities. Finally, future work is proposed that will likely shape the next generation of low-cost compliant robotics.
MEng thesis
2007年1月30日 00:00:00 GMThttps://hdl.handle.net/1721.1/358212007年01月30日T00:00:00ZOnline Active Learning in Practice
https://hdl.handle.net/1721.1/35784
Online Active Learning in Practice
Monteleoni, Claire; Kaariainen, Matti
We compare the practical performance of several recently proposed algorithms for active learning in the online setting. We consider two algorithms (and their combined variants) that are strongly online, in that they do not store any previously labeled examples, and for which formal guarantees have recently been proven under various assumptions. We perform an empirical evaluation on optical character recognition (OCR) data, an application that we argue to be appropriately served by online active learning. We compare the performance between the algorithm variants and show significant reductions in label-complexity over random sampling.
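One strongly online scheme in this family queries the label with probability shrinking in the prediction margin, a rule often attributed to Cesa-Bianchi, Gentile, and Zaniboni; a hedged sketch on top of a perceptron (whether this matches the evaluated variants exactly is an assumption):

    import numpy as np

    def online_active_perceptron(stream, b=1.0, seed=0):
        """Label-efficient online learner: query the label with probability
        b / (b + |margin|), so confident predictions are rarely checked,
        and store no past examples. stream yields (x, label_oracle) pairs;
        label_oracle() returns y in {-1, +1} when asked."""
        rng = np.random.default_rng(seed)
        w, queries = None, 0
        for x, label_oracle in stream:
            if w is None:
                w = np.zeros_like(x, dtype=float)
            margin = float(w @ x)
            if rng.random() < b / (b + abs(margin)):  # selective sampling
                y = label_oracle()                    # pay for one label
                queries += 1
                if y * margin <= 0:
                    w += y * x                        # perceptron update
        return w, queries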
2007年1月23日 00:00:00 GMThttps://hdl.handle.net/1721.1/357842007年01月23日T00:00:00ZRobot Manipulation in Human Environments
https://hdl.handle.net/1721.1/35727
Robot Manipulation in Human Environments
Edsinger, Aaron
Human environments present special challenges for robot manipulation. They are often dynamic, difficult to predict, and beyond the control of a robot engineer. Fortunately, many characteristics of these settings can be used to a robot's advantage. Human environments are typically populated by people, and a robot can rely on the guidance and assistance of a human collaborator. Everyday objects exhibit common, task-relevant features that reduce the cognitive load required for the object's use. Many tasks can be achieved through the detection and control of these sparse perceptual features. And finally, a robot is more than a passive observer of the world. It can use its body to reduce its perceptual uncertainty about the world.In this thesis we present advances in robot manipulation that address the unique challenges of human environments. We describe the design of a humanoid robot named Domo, develop methods that allow Domo to assist a person in everyday tasks, and discuss general strategies for building robots that work alongside people in their homes and workplaces.
PhD thesis
2007年1月16日 00:00:00 GMThttps://hdl.handle.net/1721.1/357272007年01月16日T00:00:00ZScale Control Processor Test-Chip
https://hdl.handle.net/1721.1/35724
Scale Control Processor Test-Chip
Batten, Christopher; Krashinsky, Ronny; Asanovic, Krste
We are investigating vector-thread architectures which provide competitive performance and efficiency across a broad class of application domains. Vector-thread architectures unify data-level, thread-level, and instruction-level parallelism, providing new ways of parallelizing codes that are difficult to vectorize or that incur excessive synchronization costs when multithreaded. To illustrate these ideas we have developed the Scale processor, which is an example of a vector-thread architecture designed for low-power and high-performance embedded systems. The prototype includes a single-issue 32-bit RISC control processor, a vector-thread unit which supports up to 128 virtual processor threads and can execute up to 16 instructions per cycle, and a 32 KB shared primary cache. Since the Scale Vector-Thread Processor is a large and complex design (especially for an academic project), we first designed and fabricated the Scale Test Chip (STC1). STC1 includes a simplified version of the Scale control processor, 8 KB of RAM, a host interface, and a custom clock generator. STC1 helped mitigate the risk involved in fabricating the full Scale chip in several ways. First, we were able to establish and test our CAD toolflow. Our toolflow included several custom tools which had not previously been used in any tapeouts. Second, we were able to better characterize our target package and process. For example, STC1 enabled us to better correlate the static timing numbers from our CAD tools with actual silicon and also to characterize the expected rise/fall times of our external signal pins. Finally, STC1 allowed us to test our custom clock generator. We used our experiences with STC1 to help us implement the Scale vector-thread processor. Scale was taped out on October 15, 2006 and it is currently being fabricated through MOSIS. This report discusses the fabrication of STC1 and presents power and performance results.
2007年1月12日 00:00:00 GMThttps://hdl.handle.net/1721.1/357242007年01月12日T00:00:00ZLatent-Dynamic Discriminative Models for Continuous Gesture Recognition
https://hdl.handle.net/1721.1/35276
Latent-Dynamic Discriminative Models for Continuous Gesture Recognition
Morency, Louis-Philippe; Quattoni, Ariadna; Darrell, Trevor
Many problems in vision involve the prediction of a class label for each frame in an unsegmented sequence. In this paper we develop a discriminative framework for simultaneous sequence segmentation and labeling which can capture both intrinsic and extrinsic class dynamics. Our approach incorporates hidden state variables which model the sub-structure of a class sequence and learn the dynamics between class labels. Each class label has a disjoint set of associated hidden states, which enables efficient training and inference in our model. We evaluated our method on the task of recognizing human gestures from unsegmented video streams and performed experiments on three different datasets of head and eye gestures. Our results demonstrate that our model for visual gesture recognition outperforms models based on Support Vector Machines, Hidden Markov Models, and Conditional Random Fields.
2007年1月07日 00:00:00 GMThttps://hdl.handle.net/1721.1/352762007年01月07日T00:00:00ZQuantifier-Free Boolean Algebra with Presburger Arithmetic is NP-Complete
https://hdl.handle.net/1721.1/35258
Quantifier-Free Boolean Algebra with Presburger Arithmetic is NP-Complete
Kuncak, Viktor
Boolean Algebra with Presburger Arithmetic (BAPA) combines 1) Boolean algebras of sets of uninterpreted elements (BA) and 2) Presburger arithmetic operations (PA). BAPA can express the relationship between integer variables and cardinalities of unbounded finite sets and can be used to express verification conditions in verification of data structure consistency properties. In this report I consider the Quantifier-Free fragment of Boolean Algebra with Presburger Arithmetic (QFBAPA). Previous algorithms for QFBAPA had non-deterministic exponential time complexity. In this report I show that QFBAPA is in NP, and is therefore NP-complete. My result yields an algorithm for checking satisfiability of QFBAPA formulas by converting them to polynomially sized formulas of quantifier-free Presburger arithmetic. I expect this algorithm to substantially extend the range of QFBAPA problems whose satisfiability can be checked in practice.
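The classical reduction from set cardinalities to integer arithmetic introduces one integer unknown per Venn region of the set variables; a cardinality term then becomes a sum of region variables. The Python sketch below illustrates only this naive encoding, which is exponential in the number of sets; the report's contribution is a polynomially sized translation that this sketch does not capture:

    from itertools import product

    SETS = ["A", "B", "C"]                  # hypothetical set variables

    # One integer unknown per Venn region, keyed by a membership bit-vector.
    regions = list(product([0, 1], repeat=len(SETS)))

    def cardinality(expr):
        # Return the region variables whose sum equals |expr|, where expr
        # maps each set name to "in", "out", or "any" (unconstrained).
        picked = []
        for r in regions:
            ok = all(expr.get(s, "any") == "any"
                     or (expr[s] == "in") == bool(bit)
                     for s, bit in zip(SETS, r))
            if ok:
                picked.append(r)
        return picked

    # |A intersect B| is the sum of the two regions with A=1, B=1 (C free):
    print(cardinality({"A": "in", "B": "in"}))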
2007年1月01日 00:00:00 GMThttps://hdl.handle.net/1721.1/352582007年01月01日T00:00:00ZBounded CCA2-Secure Non-Malleable Encryption
https://hdl.handle.net/1721.1/34968
Bounded CCA2-Secure Non-Malleable Encryption
Pass, Rafael; Shelat, Abhi; Vaikuntanathan, Vinod
Under an adaptive chosen ciphertext attack (CCA2), the security of an encryption scheme must hold against adversaries that have access to a decryption oracle. We consider a weakening of CCA2 security, wherein security need only hold against adversaries making an a-priori bounded number of queries to the decryption oracle. Concerning this notion, which we call bounded-CCA2 security, we show the following two results. (1) Bounded-CCA2 secure non-malleable encryption schemes exist if and only if semantically-secure (IND-CPA-secure) encryption schemes exist. (As far as we know, bounded-CCA2 non-malleability is the strongest notion of security known to be satisfiable assuming only the existence of semantically-secure encryption schemes.) (2) In contrast to CCA2 security, bounded-CCA2 security alone does not imply non-malleability. In particular, if there exists an encryption scheme that is bounded-CCA2 secure, then there exists another encryption scheme which remains bounded-CCA2 secure, but is malleable under a simple chosen-plaintext attack.
2006年12月14日 00:00:00 GMThttps://hdl.handle.net/1721.1/349682006年12月14日T00:00:00ZMemoization Attacks and Copy Protection in Partitioned Applications
https://hdl.handle.net/1721.1/34954
Memoization Attacks and Copy Protection in Partitioned Applications
O'Donnell, Charles W.; Suh, G. Edward; van Dijk, Marten; Devadas, Srinivas
Application source code protection is a major concern for software architects today. Secure platforms have been proposed that protect the secrecy of application algorithms and enforce copy protection assurances. Unfortunately, these capabilities incur a sizeable performance overhead. Partitioning an application into secure and insecure regions can help diminish these overheads but invalidates guarantees of code secrecy and copy protection. This work examines one of the problems of partitioning an application into public and private regions: the ability of an adversary to recreate those private regions. To our knowledge, it is the first to analyze this problem when considering application operation as a whole. Looking at the fundamentals of the issue, we analyze one of the simplest attacks possible, a "Memoization Attack." We implement an efficient Memoization Attack and discuss necessary techniques that limit storage and computation consumption. Experimentation reveals that certain classes of real-world applications are vulnerable to Memoization Attacks. To protect against such an attack, we propose a set of indicator tests that enable an application designer to identify susceptible application code regions.
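The core of such an attack can be pictured as a lookup table built by observing the inputs and outputs that cross the public/private boundary. A toy Python illustration of the idea (not the paper's implementation, which must also bound table growth):

    class MemoizationAttack:
        # Record the I/O behavior of a private code region, then replay it.
        def __init__(self):
            self.table = {}

        def observe(self, args, result):
            # Adversary watches calls crossing the public/private boundary.
            self.table[args] = result

        def replay(self, args):
            # Succeeds whenever the private region was already exercised
            # on these inputs; otherwise the secure region is still needed.
            if args in self.table:
                return self.table[args]
            raise KeyError("uncovered input: private region still needed")

    attack = MemoizationAttack()
    attack.observe(("key", 42), "result-1")
    print(attack.replay(("key", 42)))   # replays without the secure platform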
2006年12月08日 00:00:00 GMThttps://hdl.handle.net/1721.1/349542006年12月08日T00:00:00ZDistributed Area Search with a Team of Robots
https://hdl.handle.net/1721.1/34943
Distributed Area Search with a Team of Robots
Tzanov, Velin K.
The main goal of this thesis is to demonstrate the applicability of the distributed systems paradigm to robotic systems. This goal is accomplished by presenting two solutions to the Distributed Area Search problem: organizing a team of robots to collaborate in the task of searching through an area. The first solution is designed for unreliable robots equipped with a reliable GPS-style localization system. This solution demonstrates the efficiency and fault-tolerance of this type of distributed robotic systems, as well as their applicability to the real world. We present a theoretically near-optimal algorithm for solving Distributed Area Search under this setting, and we also present an implementation of our algorithm on an actual system, consisting of twelve robots. The second solution is designed for a completely autonomous system, without the aid of any centralized subsystem. It demonstrates how a distributed robotic system can solve a problem that is practically unsolvable for a single-robot system.
MEng thesis
2006年12月05日 00:00:00 GMThttps://hdl.handle.net/1721.1/349432006年12月05日T00:00:00ZMaterialization Strategies in a Column-Oriented DBMS
https://hdl.handle.net/1721.1/34929
Materialization Strategies in a Column-Oriented DBMS
Abadi, Daniel J.; Myers, Daniel S.; DeWitt, David J.; Madden, Samuel R.
There has been renewed interest in column-oriented database architectures in recent years. For read-mostly query workloads such as those found in data warehouse and decision support applications, "column-stores" have been shown to perform particularly well relative to "row-stores." In order for column-stores to be readily adopted as a replacement for row-stores, however, they must present the same interface to client applications as do row-stores, which implies that they must output row-store-style tuples. Thus, the input columns stored on disk must be converted to rows at some point in the query plan, but the optimal point at which to do the conversion is not obvious. This problem can be considered as the opposite of the projection problem in row-store systems: while row-stores need to determine where in query plans to place projection operators to make tuples narrower, column-stores need to determine when to combine single-column projections into wider tuples. This paper describes a variety of strategies for tuple construction and intermediate result representations and provides a systematic evaluation of these strategies.
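The early/late materialization trade-off can be made concrete with a toy example: materialize tuples before filtering, or filter a single column and stitch tuples only for surviving positions. A Python sketch with hypothetical columns:

    # Two hypothetical columns of the same table, stored separately.
    price = [10, 25, 7, 30]
    qty   = [1, 2, 3, 4]

    def early_materialization(pred):
        # Stitch full tuples first, then filter whole rows.
        rows = list(zip(price, qty))
        return [r for r in rows if pred(r[0])]

    def late_materialization(pred):
        # Filter on the single column, carry positions, stitch at the end.
        positions = [i for i, p in enumerate(price) if pred(p)]
        return [(price[i], qty[i]) for i in positions]

    assert early_materialization(lambda p: p > 9) == \
           late_materialization(lambda p: p > 9)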
2006年11月27日 00:00:00 GMThttps://hdl.handle.net/1721.1/349292006年11月27日T00:00:00ZScoop: An Adaptive Indexing Scheme for Stored Data in Sensor Networks
https://hdl.handle.net/1721.1/34916
Scoop: An Adaptive Indexing Scheme for Stored Data in Sensor Networks
Gil, Thomer M.; Madden, Samuel
In this paper, we present the design of Scoop, a system for indexing and querying stored data in sensor networks. Scoop works by collecting statistics about the rate of queries and distribution of sensor readings over a sensor network, and uses those statistics to build an index that tells nodes where in the network to store their readings. Using this index, a user's queries over that stored data can be answered efficiently, without flooding those queries throughout the network. This approach offers a substantial advantage over other solutions that either store all data externally on a basestation (requiring every reading to be collected from all nodes), or that store all data locally on the node that produced it (requiring queries to be flooded throughout the network). Our results, in fact, show that Scoop offers a factor of four improvement over existing techniques in a real implementation on a 64-node mote-based sensor network. These results also show that Scoop is able to efficiently adapt to changes in the distribution and rates of data and queries.
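The essence of such an index is a mapping from a reading's value to the node responsible for storing it, so that a query for a value goes directly to the right node. A hypothetical Python sketch (the bin structure and fallback rule are illustrative assumptions, not Scoop's actual algorithm):

    NODES = list(range(64))                      # hypothetical 64-node network

    def owner(value, bins):
        # Map a sensor value to the node that stores (and answers queries for) it.
        for (lo, hi), node in bins:
            if lo <= value < hi:
                return node
        return NODES[hash(value) % len(NODES)]   # fallback for unbinned values

    # Bins would be rebuilt as the statistics of data and queries drift.
    bins = [((0, 50), 3), ((50, 100), 17)]
    assert owner(42, bins) == 3                  # value 42 lives at node 3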
2006年11月27日 00:00:00 GMThttps://hdl.handle.net/1721.1/349162006年11月27日T00:00:00ZContext-based Visual Feedback Recognition
https://hdl.handle.net/1721.1/34893
Context-based Visual Feedback Recognition
Morency, Louis-Philippe
During face-to-face conversation, people use visual feedback (e.g., head and eye gesture) to communicate relevant information and to synchronize rhythm between participants. When recognizing visual feedback, people often rely on more than their visual perception. For instance, knowledge about the current topic and from previous utterances helps guide the recognition of nonverbal cues. The goal of this thesis is to augment computer interfaces with the ability to perceive visual feedback gestures and to enable the exploitation of contextual information from the current interaction state to improve visual feedback recognition. We introduce the concept of visual feedback anticipation, where contextual knowledge from an interactive system (e.g., last spoken utterance from the robot or system events from the GUI interface) is analyzed online to anticipate visual feedback from a human participant and improve visual feedback recognition. Our multi-modal framework for context-based visual feedback recognition was successfully tested on conversational and non-embodied interfaces for head and eye gesture recognition. We also introduce the Frame-based Hidden-state Conditional Random Field (FHCRF) model, a new discriminative model for visual gesture recognition which can model the sub-structure of a gesture sequence, learn the dynamics between gesture labels, and be directly applied to label unsegmented sequences. The FHCRF model outperforms previous approaches (i.e., HMM, SVM and CRF) for visual gesture recognition and can efficiently learn relevant contextual information necessary for visual feedback anticipation. A real-time visual feedback recognition library for interactive interfaces (called Watson) was developed to recognize head gaze, head gestures, and eye gaze using the images from a monocular or stereo camera and the context information from the interactive system. Watson was downloaded by more than 70 researchers around the world and was successfully used by MERL, USC, NTT, MIT Media Lab and many other research groups.
PhD thesis
2006年11月15日 00:00:00 GMThttps://hdl.handle.net/1721.1/348932006年11月15日T00:00:00ZQuantitative Information-Flow Tracking for C and Related Languages
https://hdl.handle.net/1721.1/34892
Quantitative Information-Flow Tracking for C and Related Languages
McCamant, Stephen; Ernst, Michael D.
We present a new approach for tracking programs' use of data through arbitrary calculations, to determine how much information about secret inputs is revealed by public outputs. Using a fine-grained dynamic bit-tracking analysis, the technique measures the information revealed during a particular execution. The technique accounts for indirect flows, e.g., via branches and pointer operations. Two kinds of untrusted annotation improve the precision of the analysis. An implementation of the technique based on dynamic binary translation is demonstrated on real C, C++, and Objective C programs of up to half a million lines of code. In case studies, the tool checked multiple security policies, including one that was violated by a previously unknown bug.
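Dynamic flow tracking of this general kind can be pictured as values that carry the set of secret inputs that influenced them, with the set propagated through every operation. A toy Python illustration (the paper's tool works at the binary level and counts bits; this sketch only propagates coarse labels):

    class Tainted:
        # Value paired with the set of secret input labels that influenced it.
        def __init__(self, value, bits=frozenset()):
            self.value, self.bits = value, frozenset(bits)

        def __add__(self, other):
            return Tainted(self.value + other.value, self.bits | other.bits)

        def __mul__(self, other):
            return Tainted(self.value * other.value, self.bits | other.bits)

    secret = Tainted(13, {"s0"})
    public = Tainted(2)
    out = secret * public + public          # output influenced by secret "s0"
    print(out.value, sorted(out.bits))      # 28 ['s0']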
2006年11月17日 00:00:00 GMThttps://hdl.handle.net/1721.1/348922006年11月17日T00:00:00ZA Fast Approximation of the Bilateral Filter using a Signal Processing Approach
https://hdl.handle.net/1721.1/34876
A Fast Approximation of the Bilateral Filter using a Signal Processing Approach
Paris, Sylvain; Durand, Fredo
The bilateral filter is a nonlinear filter that smoothes a signal while preserving strong edges. It has demonstrated great effectiveness for a variety of problems in computer vision and computer graphics, and fast versions have been proposed. Unfortunately, little is known about the accuracy of such accelerations. In this paper, we propose a new signal-processing analysis of the bilateral filter which complements the recent studies that analyzed it as a PDE or as a robust statistical estimator. The key to our analysis is to express the filter in a higher-dimensional space where the signal intensity is added to the original domain dimensions. Importantly, this signal-processing perspective allows us to develop a novel bilateral filtering acceleration using downsampling in space and intensity. This affords a principled expression of accuracy in terms of bandwidth and sampling. The bilateral filter can be expressed as linear convolutions in this augmented space followed by two simple nonlinearities. This allows us to derive criteria for downsampling the key operations and achieving important acceleration of the bilateral filter. We show that, for the same running time, our method is more accurate than previous acceleration techniques. Typically, we are able to process a 2 megapixel image using our acceleration technique in less than a second, and have the result be visually similar to the exact computation that takes several tens of minutes. The acceleration is most effective with large spatial kernels. Furthermore, this approach extends naturally to color images and cross bilateral filtering.
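The paper's higher-dimensional view can be illustrated in one dimension: splat the signal into a coarse (space, intensity) grid with a homogeneous weight channel, blur the grid with a Gaussian, then slice and divide. A Python sketch (NumPy and SciPy) under the assumption that the signal is normalized to [0, 1]; the parameter values are arbitrary:

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def fast_bilateral_1d(signal, sigma_s=4.0, sigma_r=0.1, ds=4, dr=0.05):
        # Signal assumed normalized to [0, 1]; a 1-D stand-in for an image.
        n = len(signal)
        xs = np.arange(n) // ds                        # downsampled space axis
        rs = (signal / dr).astype(int)                 # downsampled intensity axis
        grid_v = np.zeros((xs.max() + 1, rs.max() + 1))
        grid_w = np.zeros_like(grid_v)
        np.add.at(grid_v, (xs, rs), signal)            # value channel
        np.add.at(grid_w, (xs, rs), 1.0)               # homogeneous weight channel
        sig = (sigma_s / ds, sigma_r / dr)
        grid_v = gaussian_filter(grid_v, sig)          # the linear convolutions
        grid_w = gaussian_filter(grid_w, sig)
        # Slicing and division: the two simple nonlinearities.
        return grid_v[xs, rs] / np.maximum(grid_w[xs, rs], 1e-8)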
2006年11月09日 00:00:00 GMThttps://hdl.handle.net/1721.1/348762006年11月09日T00:00:00ZOn the Adaptive Real-Time Detection of Fast-Propagating Network Worms
https://hdl.handle.net/1721.1/34875
On the Adaptive Real-Time Detection of Fast-Propagating Network Worms
Jung, Jaeyeon; Milito, Rodolfo A.; Paxson, Vern
We present two light-weight worm detection algorithms that offer significant advantages over fixed-threshold methods. The first algorithm, RBS (rate-based sequential hypothesis testing), aims at the large class of worms that attempt to propagate quickly, thus exhibiting abnormal levels of the rate at which hosts initiate connections to new destinations. The foundation of RBS derives from the theory of sequential hypothesis testing, the use of which for detecting randomly scanning hosts was first introduced by our previous work with the TRW (Threshold Random Walk) scan detection algorithm. The sequential hypothesis testing methodology enables engineering the detectors to meet false positive and false negative targets, rather than triggering when fixed thresholds are crossed. In this sense, the detectors that we introduce are truly adaptive. We then introduce RBS+TRW, an algorithm that combines the fan-out rate (RBS) and probability of failure (TRW) of connections to new destinations. RBS+TRW provides a unified framework that at one end acts as pure RBS and at the other end as pure TRW, and extends RBS's power in detecting worms that scan randomly selected IP addresses.
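The underlying sequential hypothesis test accumulates log-likelihood ratios and stops at thresholds derived from the target error rates. The Python sketch below shows the generic Bernoulli skeleton (closer to TRW's success/failure test than to RBS's rate formulation; the probabilities are made-up):

    import math

    def sprt(observations, p_benign, p_worm, alpha=0.01, beta=0.01):
        # Thresholds follow from the target false positive/negative rates.
        upper = math.log((1 - beta) / alpha)      # declare worm
        lower = math.log(beta / (1 - alpha))      # declare benign
        llr = 0.0
        for x in observations:                    # x = 1: contact to a new host
            llr += math.log((p_worm if x else 1 - p_worm) /
                            (p_benign if x else 1 - p_benign))
            if llr >= upper:
                return "worm"
            if llr <= lower:
                return "benign"
        return "undecided"

    print(sprt([1, 1, 1, 1, 1, 1, 1], p_benign=0.2, p_worm=0.8))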
2006年11月10日 00:00:00 GMThttps://hdl.handle.net/1721.1/348752006年11月10日T00:00:00ZOn Using First-Order Theorem Provers in the Jahob Data Structure Verification System
https://hdl.handle.net/1721.1/34874
On Using First-Order Theorem Provers in the Jahob Data Structure Verification System
Bouillaguet, Charles; Kuncak, Viktor; Wies, Thomas; Zee, Karen; Rinard, Martin
This paper presents our integration of efficient resolution-based theorem provers into the Jahob data structure verification system. Our experimental results show that this approach enables Jahob to automatically verify the correctness of a range of complex dynamically instantiable data structures, including data structures such as hash tables and search trees, without the need for interactive theorem proving or techniques tailored to individual data structures. Our primary technical results include: (1) a translation from higher-order logic to first-order logic that enables the application of resolution-based theorem provers and (2) a proof that eliminating type (sort) information in formulas is both sound and complete, even in the presence of a generic equality operator. Our experimental results show that the elimination of type information dramatically decreases the time required to prove the resulting formulas. These techniques enabled us to verify complex correctness properties of Java programs such as a mutable set implemented as an imperative linked list, a finite map implemented as a functional ordered tree, a hash table with a mutable array, and a simple library system example that uses these container data structures. Our system verifies (in a matter of minutes) that data structure operations correctly update the finite map, that they preserve data structure invariants (such as ordering of elements, membership in appropriate hash table buckets, or relationships between sets and relations), and that there are no run-time errors such as null dereferences or array out of bounds accesses.
2006年11月09日 00:00:00 GMThttps://hdl.handle.net/1721.1/348742006年11月09日T00:00:00ZAnalogical Retrieval via Intermediate Features: The Goldilocks Hypothesis
https://hdl.handle.net/1721.1/34635
Analogical Retrieval via Intermediate Features: The Goldilocks Hypothesis
Finlayson, Mark Alan; Winston, Patrick Henry
Analogical reasoning has been implicated in many important cognitive processes, such as learning, categorization, planning, and understanding natural language. Therefore, to obtain a full understanding of these processes, we must come to a better understanding of how people reason by analogy. Analogical reasoning is thought to occur in at least three stages: retrieval of a source description from memory upon presentation of a target description, mapping of the source description to the target description, and transfer of relationships from source description to target description. Here we examine the first stage, the retrieval of relevant sources from long-term memory for their use in analogical reasoning. Specifically we ask: what can people retrieve from long-term memory, and how do they do it? Psychological experiments show that subjects display two sorts of retrieval patterns when reasoning by analogy: a novice pattern and an expert pattern. Novice-like subjects are more likely to recall superficially similar descriptions that are not helpful for reasoning by analogy. Conversely, expert-like subjects are more likely to recall structurally-related descriptions that are useful for further analogical reasoning. Previous computational models of the retrieval stage have only attempted to model novice-like retrieval. We introduce a computational model that can demonstrate both novice-like and expert-like retrieval with the same mechanism. The parameter of the model that is varied to produce these two types of retrieval is the average size of the features used to identify matches in memory. We find, in agreement with an intuition from the work of Ullman and co-workers regarding the use of features in visual classification (Ullman, Vidal-Naquet, & Sali, 2002), that features of an intermediate size are most useful for analogical retrieval. We conducted two computational experiments on our own dataset of fourteen formally described stories, which showed that our model gives the strongest analogical retrieval, and is most expert-like, when it uses features that are on average of intermediate size. We conducted a third computational experiment on the Karla the Hawk dataset, which showed a modest effect consistent with our predictions. Because our model and Ullman's work both rely on intermediate-sized features to perform recognition-like tasks, we take both as supporting what we call the Goldilocks hypothesis: that on average the features that are maximally useful for recognition are neither too small nor too large, neither too simple nor too complex, but rather are in the middle, of intermediate size and complexity.
2006年11月07日 00:00:00 GMThttps://hdl.handle.net/1721.1/346352006年11月07日T00:00:00ZImplementing Atomic Data through Indirect Learning in Dynamic Network
https://hdl.handle.net/1721.1/34249
Implementing Atomic Data through Indirect Learning in Dynamic Network
Konwar, K.; Musial, P.M.; Nicolau, N.C.; Shvartsman, A.A.
Developing middleware services for dynamic distributed systems, e.g., ad-hoc networks, is a challenging task given that such services must deal with communicating devices that may join and leave the system, and fail or experience arbitrary delays. Algorithms developed for static settings are often not usable in dynamic settings because they rely on (logical) all-to-all connectivity or assume underlying routing protocols, which may be unfeasible in highly dynamic settings. This paper explores the indirect learning approach to information dissemination within a dynamic distributed data service. The indirect learning scheme is used to improve the liveness of the atomic read/write object service in settings with uncertain connectivity. The service is formally proved to be correct, i.e., the atomicity of the objects is guaranteed in all executions. Conditional analysis of the performance of the new service is presented. This analysis has the potential of being generalized to other similar dynamic algorithms. Under the assumption that the network is connected, and assuming reasonable timing conditions, the bounds on the duration of the read/write operations of the new service are calculated. Finally, the paper proposes a deployment strategy where indirect learning leads to an improvement in communication costs relative to a previous solution.
2006年10月12日 00:00:00 GMThttps://hdl.handle.net/1721.1/342492006年10月12日T00:00:00ZProgramming a Sensor Network as an Amorphous Medium
https://hdl.handle.net/1721.1/34223
Programming a Sensor Network as an Amorphous Medium
Bachrach, Jonathan; Beal, Jacob
In many sensor network applications, the network is deployed to approximate a physical space. The network itself is not of interest: rather, we are interested in measuring the properties of the space it fills, and in establishing control over the behavior of that space. The spatial nature of sensor network applications means that many can be expressed naturally and succinctly in terms of the global behavior of an amorphous medium, a continuous computational material filling the space of interest. Although we cannot construct such a material, we can approximate it using a sensor network. Using this amorphous medium abstraction separates sensor network problems into two largely independent domains. Above the abstraction barrier we are concerned with long-range coordination and concise description of applications, while below the barrier we are concerned with fast, efficient, and robust communication between neighboring devices. We apply the amorphous medium abstraction with Proto, a high-level language for programming sensor/actuator networks. Existing applications, such as target tracking and threat avoidance, can be expressed in only a few lines of Proto code. The applications are then compiled for execution on a kernel that approximates an amorphous medium. Programs written using our Proto implementation have been verified in simulation on over ten thousand nodes, as well as on a network of Berkeley Motes.
2006年6月01日 00:00:00 GMThttps://hdl.handle.net/1721.1/342232006年06月01日T00:00:00ZThe Design of a Relational Engine
https://hdl.handle.net/1721.1/34218
The Design of a Relational Engine
Torlak, Emina; Jackson, Daniel
The key design challenges in the construction of a SAT-based relational engine are described, and novel techniques are proposed to address them. An efficient engine must have a mechanism for specifying partial solutions, an effective symmetry detection and breaking scheme, and an economical translation from relational to boolean logic. These desiderata are addressed with three new techniques: a symmetry detection algorithm that works in the presence of partial solutions, a sparse-matrix representation of relations, and a compact representation of boolean formulas inspired by boolean expression diagrams and reduced boolean circuits. The presented techniques have been implemented and evaluated, with promising results.
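The sparse-matrix idea can be pictured as storing a relation as a map from tuple indices to boolean formulas, with relational join computed like matrix multiplication. A loose Python sketch (formulas here are plain nested tuples rather than the compact circuit representation the paper proposes):

    # A binary relation as a sparse boolean matrix over atoms 0..3:
    # only nonzero entries are stored; True marks a tuple known to hold.
    edges = {(0, 1): "e01", (1, 2): "e12", (2, 3): True}

    def join(r, s):
        # Entry (i, k) of R.S is the OR over j of AND(r[i, j], s[j, k]);
        # sparse iteration touches only the stored entries.
        out = {}
        for (i, j), f in r.items():
            for (j2, k), g in s.items():
                if j == j2:
                    term = ("and", f, g)
                    out[(i, k)] = ("or", out[(i, k)], term) if (i, k) in out else term
        return out

    print(join(edges, edges))   # length-two paths, as symbolic boolean formulas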
2006年9月29日 00:00:00 GMThttps://hdl.handle.net/1721.1/342182006年09月29日T00:00:00ZAdaptation for Regularization Operators in Learning Theory
https://hdl.handle.net/1721.1/34217
Adaptation for Regularization Operators in Learning Theory
Caponnetto, Andrea; Yao, Yuan
We consider learning algorithms induced by regularization methods in the regression setting. We show that previously obtained error bounds for these algorithms, using a-priori choices of the regularization parameter, can be attained using a suitable a-posteriori choice based on validation. In particular, these results prove adaptation of the rate of convergence of the estimators to the minimax rate induced by the "effective dimension" of the problem. We also show universal consistency for this class of methods.
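The a-posteriori choice amounts to fitting the estimator for each candidate regularization parameter and keeping the one with the smallest validation error. A minimal Python sketch for ridge-style regularization (the grid and data split are illustrative assumptions):

    import numpy as np

    def choose_lambda(X_tr, y_tr, X_val, y_val, grid=np.logspace(-6, 2, 9)):
        # A-posteriori selection by validation, replacing an a-priori schedule.
        best, best_err = None, float("inf")
        d = X_tr.shape[1]
        for lam in grid:
            w = np.linalg.solve(X_tr.T @ X_tr + lam * np.eye(d), X_tr.T @ y_tr)
            err = np.mean((X_val @ w - y_val) ** 2)
            if err < best_err:
                best, best_err = lam, err
        return best

    rng = np.random.default_rng(0)
    X = rng.standard_normal((80, 10)); w_true = rng.standard_normal(10)
    y = X @ w_true + 0.1 * rng.standard_normal(80)
    print(choose_lambda(X[:60], y[:60], X[60:], y[60:]))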
2006年9月10日 00:00:00 GMThttps://hdl.handle.net/1721.1/342172006年09月10日T00:00:00ZOptimal Rates for Regularization Operators in Learning Theory
https://hdl.handle.net/1721.1/34216
Optimal Rates for Regularization Operators in Learning Theory
Caponnetto, Andrea
We develop some new error bounds for learning algorithms induced by regularization methods in the regression setting. The "hardness" of the problem is characterized in terms of the parameters r and s, the first related to the "complexity" of the target function, the second connected to the effective dimension of the marginal probability measure over the input space. We show, extending previous results, that by a suitable choice of the regularization parameter as a function of the number of the available examples, it is possible to attain the optimal minimax rates of convergence for the expected squared loss of the estimators, over the family of priors fulfilling the constraint r + s > 1/2. The setting considers both labelled and unlabelled examples, the latter being crucial for the optimality results on the priors in the range r < 1/2.
2006年9月10日 00:00:00 GMThttps://hdl.handle.net/1721.1/342162006年09月10日T00:00:00ZUbiquitous Memory Introspection (Preliminary Manuscript)
https://hdl.handle.net/1721.1/34013
Ubiquitous Memory Introspection (Preliminary Manuscript)
Zhao, Qin; Rabbah, Rodric; Amarasinghe, Saman; Rudolph, Larry; Wong, Weng-Fai
Modern memory systems play a critical role in the performance of applications, but a detailed understanding of the application behavior in the memory system is not trivial to attain. It requires time-consuming simulations of the memory hierarchy using long traces, and often using detailed modeling. It is increasingly possible to access hardware performance counters to measure events in the memory system, but the measurements remain coarse grained, better suited for performance summaries than providing instruction level feedback. The availability of a low cost, online, and accurate methodology for deriving fine-grained memory behavior profiles can prove extremely useful for runtime analysis and optimization of programs. This paper presents a new methodology for Ubiquitous Memory Introspection (UMI). It is an online and lightweight mini-simulation methodology that focuses on simulating short memory access traces recorded from frequently executed code regions. The simulations are fast and can provide profiling results at varying granularities, down to that of a single instruction or address. UMI naturally complements runtime optimization techniques and enables new opportunities for memory-specific optimizations. In this paper, we present a prototype implementation of a runtime system implementing UMI. The prototype is readily deployed on commodity processors, requires no user intervention, and can operate with stripped binaries and legacy software. The prototype operates with an average runtime overhead of 20%, but this slowdown is only 6% slower than a state-of-the-art binary instrumentation tool. We used 32 benchmarks, including the full suite of SPEC2000 benchmarks, for our evaluation. We show that the mini-simulation results accurately reflect the cache performance of two existing memory systems, an Intel Pentium 4 and an AMD Athlon MP (K7) processor. We also demonstrate that low level profiling information from the online simulation can serve to identify high-miss-rate load instructions with a 77% rate of accuracy compared to full offline simulations that required days to complete. The online profiling results are used at runtime to implement a simple software prefetching strategy that achieves a speedup greater than 60% in the best case.
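A mini-simulation in this spirit replays a short recorded trace of (instruction, address) pairs through a small cache model to obtain per-instruction miss counts. A toy Python sketch with a direct-mapped cache and a hypothetical trace:

    def simulate_direct_mapped(trace, n_sets=256, line=64):
        # Replay a short recorded address trace through a tiny cache model,
        # returning per-instruction miss counts (the fine-grained profile).
        tags = [None] * n_sets
        misses = {}
        for pc, addr in trace:                 # (instruction, effective address)
            s = (addr // line) % n_sets
            tag = addr // (line * n_sets)
            if tags[s] != tag:
                tags[s] = tag
                misses[pc] = misses.get(pc, 0) + 1
        return misses

    # Hypothetical trace: one streaming load plus one load that thrashes a set.
    trace = [(0x400a, 64 * i) for i in range(8)] + \
            [(0x400b, 0), (0x400b, 64 * 256)] * 4
    print(simulate_direct_mapped(trace))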
2006年9月25日 00:00:00 GMThttps://hdl.handle.net/1721.1/340132006年09月25日T00:00:00ZRingScalar: A Complexity-Effective Out-of-Order Superscalar Microarchitecture
https://hdl.handle.net/1721.1/34012
RingScalar: A Complexity-Effective Out-of-Order Superscalar Microarchitecture
Tseng, Jessica H.; Asanovic, Krste
RingScalar is a complexity-effective microarchitecture for out-of-order superscalar processors that reduces the area, latency, and power of all major structures in the instruction flow. The design divides an N-way superscalar into N columns connected in a unidirectional ring, where each column contains a portion of the instruction window, a bank of the register file, and an ALU. The design exploits the fact that most decoded instructions are waiting on just one operand, using only a single tag per issue window entry and restricting instruction wakeup and value bypass to communicate only with the neighboring column. Detailed simulations of four-issue single-threaded machines running SPECint2000 show that RingScalar has IPC only 13% lower than an idealized superscalar, while providing large reductions in area, power, and circuit latency.
2006年9月18日 00:00:00 GMThttps://hdl.handle.net/1721.1/340122006年09月18日T00:00:00ZCombined static and dynamic mutability analysis
https://hdl.handle.net/1721.1/33968
Combined static and dynamic mutability analysis
Artzi, Shay; Ernst, Michael D.; Glasser, David; Kiezun, Adam
Knowing which method parameters may be mutated during a method's execution is useful for many software engineering tasks. We present an approach to discovering parameter immutability, in which several lightweight, scalable analyses are combined in stages, with each stage refining the overall result. The resulting analysis is scalable and combines the strengths of its component analyses. As one of the component analyses, we present a novel dynamic mutability analysis and show how its results can be improved by random input generation. Experimental results on programs of up to 185 kLOC demonstrate that, compared to previous approaches, our approach increases both scalability and overall accuracy.
2006年9月17日 00:00:00 GMThttps://hdl.handle.net/1721.1/339682006年09月17日T00:00:00ZVirtual Monotonic Counters and Count-Limited Objects using a TPM without a Trusted OS (Extended Version)
https://hdl.handle.net/1721.1/33966
Virtual Monotonic Counters and Count-Limited Objects using a TPM without a Trusted OS (Extended Version)
Sarmenta, Luis F. G.; van Dijk, Marten; O'Donnell, Charles W.; Rhodes, Jonathan; Devadas, Srinivas
A trusted monotonic counter is a valuable primitive that enables a wide variety of highly scalable offline and decentralized applications that would otherwise be prone to replay attacks, including offline payment, e-wallets, virtual trusted storage, and digital rights management (DRM). In this paper, we show how one can implement a very large number of virtual monotonic counters on an untrusted machine with a Trusted Platform Module (TPM) or similar device, without relying on a trusted OS. We first present a log-based scheme that can be implemented with the current version of the TPM (1.2) and used in certain applications. We then show how the addition of a few simple features to the TPM makes it possible to implement a hash-tree-based scheme that not only offers improved performance and scalability compared to the log-based scheme, but also makes it possible to implement count-limited objects (or "clobs" for short), i.e., encrypted keys, data, and other objects that can only be used when an associated virtual monotonic counter is within a certain range. Such count-limited objects include n-time use keys, n-out-of-m data blobs, n-copy migratable objects, and other variants, which have many potential uses in digital rights management (DRM), digital cash, digital voting, itinerant computing, and other application areas.
2006年9月11日 00:00:00 GMThttps://hdl.handle.net/1721.1/339662006年09月11日T00:00:00ZRefactoring for parameterizing Java classes
https://hdl.handle.net/1721.1/33965
Refactoring for parameterizing Java classes
Kiezun, Adam; Ernst, Michael D.; Tip, Frank; Fuhrer, Robert M.
Type safety and expressiveness of many existing Java libraries and their client applications would improve if the libraries were upgraded to define generic classes. Efficient and accurate tools exist to assist client applications in using generic libraries, but so far the libraries themselves must be parameterized manually, which is a tedious, time-consuming, and error-prone task. We present a type-constraint-based algorithm for converting non-generic libraries to add type parameters. The algorithm handles the full Java language and preserves backward compatibility, thus making it safe for existing clients. Among other features, it is capable of inferring wildcard types and introducing type parameters for mutually-dependent classes. We have implemented the algorithm as a fully automatic refactoring in Eclipse. We evaluated our work in two ways. First, our tool parameterized code that was lacking type parameters. We contacted the developers of several of these applications, and in all cases where we received a response, they confirmed that the resulting parameterizations were correct and useful. Second, to better quantify its effectiveness, our tool parameterized classes from already-generic libraries, and we compared the results to those that were created by the libraries' authors. Our tool performed the refactoring accurately -- in 87% of cases the results were as good as those created manually by a human expert, in 9% of cases the tool results were better, and in 4% of cases the tool results were worse.
2006年9月05日 00:00:00 GMThttps://hdl.handle.net/1721.1/339652006年09月05日T00:00:00ZTask-Structured Probabilistic I/O Automata
https://hdl.handle.net/1721.1/33964
Task-Structured Probabilistic I/O Automata
Canetti, Ran; Cheung, Ling; Kaynar, Dilsun; Liskov, Moses; Lynch, Nancy; Pereira, Olivier; Segala, Roberto
Modeling frameworks such as Probabilistic I/O Automata (PIOA) and Markov Decision Processes permit both probabilistic and nondeterministic choices. In order to use such frameworks to express claims about probabilities of events, one needs mechanisms for resolving nondeterministic choices. For PIOAs, nondeterministic choices have traditionally been resolved by schedulers that have perfect information about the past execution. However, such schedulers are too powerful for certain settings, such as cryptographic protocol analysis, where information must sometimes be hidden. Here, we propose a new, less powerful nondeterminism-resolution mechanism for PIOAs, consisting of tasks and local schedulers. Tasks are equivalence classes of system actions that are scheduled by oblivious, global task sequences. Local schedulers resolve nondeterminism within system components, based on local information only. The resulting task-PIOA framework yields simple notions of external behavior and implementation, and supports simple compositionality results. We also define a new kind of simulation relation, and show it to be sound for proving implementation. We illustrate the potential of the task-PIOA framework by outlining its use in verifying an Oblivious Transfer protocol.
2006年9月05日 00:00:00 GMThttps://hdl.handle.net/1721.1/339642006年09月05日T00:00:00ZJavari: Adding Reference Immutability to Java
https://hdl.handle.net/1721.1/33963
Javari: Adding Reference Immutability to Java
Tschantz, Matthew S.
This paper describes a programming language, Javari, that is capable of expressing and enforcing immutability constraints. The specific constraint expressed is that the abstract state of the object to which an immutable reference refers cannot be modified using that reference. The abstract state is (part of) the transitively reachable state: that is, the state of the object and all state reachable from it by following references. The type system permits explicitly excluding fields from the abstract state of an object. For a statically type-safe language, the type system guarantees reference immutability. The type system distinguishes the notions of assignability and mutability; integrates with Java's generic types and with multi-dimensional arrays; provides a mutability polymorphism approach to avoiding code duplication; and has type-safe support for reflection and serialization. This paper describes a core calculus including formal type rules for the language. Additionally, this paper describes a type inference algorithm that can be used to convert existing Java programs to Javari. Experimental results from a prototype implementation of the algorithm are presented.
MEng thesis
2006年9月05日 00:00:00 GMThttps://hdl.handle.net/1721.1/339632006年09月05日T00:00:00ZRandom Lens Imaging
https://hdl.handle.net/1721.1/33962
Random Lens Imaging
Fergus, Rob; Torralba, Antonio; Freeman, William T.
We call a random lens one for which the function relating the input light ray to the output sensor location is pseudo-random. Imaging systems with random lenses can expand the space of possible camera designs, allowing new trade-offs in optical design and potentially adding new imaging capabilities. Machine learning methods are critical for both camera calibration and image reconstruction from the sensor data. We develop the theory and compare two different methods for calibration and reconstruction: an MAP approach, and basis pursuit from compressive sensing. We show proof-of-concept experimental results from a random lens made from a multi-faceted mirror, showing successful calibration and image reconstruction. We illustrate the potential for super-resolution and 3D imaging.
2006年9月02日 00:00:00 GMThttps://hdl.handle.net/1721.1/339622006年09月02日T00:00:00ZFinding the needles in the haystack: Generating legal test inputs for object-oriented programs
https://hdl.handle.net/1721.1/33959
Finding the needles in the haystack: Generating legal test inputs for object-oriented programs
Artzi, Shay; Ernst, Michael D.; Kiezun, Adam; Pacheco, Carlos; Perkins, Jeff H.
A test input for an object-oriented program typically consists of a sequence of method calls that use the API defined by the program under test. Generating legal test inputs can be challenging because, for some programs, the set of legal method sequences is much smaller than the set of all possible sequences; without a formal specification of legal sequences, an input generator is bound to produce mostly illegal sequences. We propose a scalable technique that combines dynamic analysis with random testing to help an input generator create legal test inputs without a formal specification, even for programs in which most sequences are illegal. The technique uses an example execution of the program to infer a model of legal call sequences, and uses the model to guide a random input generator towards legal but behaviorally-diverse sequences. We have implemented our technique for Java, in a tool called Palulu, and evaluated its effectiveness in creating legal inputs for real programs. Our experimental results indicate that the technique is effective and scalable. Our preliminary evaluation indicates that the technique can quickly generate legal sequences for complex inputs: in a case study, Palulu created legal test inputs in seconds for a set of complex classes, for which it took an expert thirty minutes to generate a single legal input.
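The model-inference step can be pictured as recording which method may follow which in the example execution, then random-walking over that graph to generate legal but varied sequences. A toy Python sketch (the trace and method names are hypothetical; Palulu's actual model is per-object and richer):

    import random

    def infer_model(trace):
        # Record, for each observed call, which calls were seen to follow it.
        follows = {}
        for prev, nxt in zip(trace, trace[1:]):
            follows.setdefault(prev, set()).add(nxt)
        return follows

    def generate(model, start, length, seed=0):
        # Random walk over the model: legal by construction, diverse by chance.
        rng = random.Random(seed)
        seq, cur = [start], start
        while len(seq) < length and model.get(cur):
            cur = rng.choice(sorted(model[cur]))
            seq.append(cur)
        return seq

    trace = ["<init>", "open", "write", "write", "close"]   # hypothetical run
    print(generate(infer_model(trace), "<init>", 5))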
2006年8月31日 00:00:00 GMThttps://hdl.handle.net/1721.1/339592006年08月31日T00:00:00ZLearning with Online Constraints: Shifting Concepts and Active Learning
https://hdl.handle.net/1721.1/33958
Learning with Online Constraints: Shifting Concepts and Active Learning
Monteleoni, Claire E.
Many practical problems, such as forecasting, real-time decision making, streaming data applications, and resource-constrained learning, can be modeled as learning with online constraints. This thesis is concerned with analyzing and designing algorithms for learning under the following online constraints: 1) The algorithm has only sequential, or one-at-a-time, access to data. 2) The time and space complexity of the algorithm must not scale with the number of observations. We analyze learning with online constraints in a variety of settings, including active learning. The active learning model is applicable to any domain in which unlabeled data is easy to come by and there exists a (potentially difficult or expensive) mechanism by which to attain labels. First, we analyze a supervised learning framework in which no statistical assumptions are made about the sequence of observations, and algorithms are evaluated based on their regret, i.e., their relative prediction loss with respect to the hindsight-optimal algorithm in a comparator class. We derive a lower bound on regret for a class of online learning algorithms designed to track shifting concepts in this framework. We apply an algorithm we provided in previous work, which avoids this lower bound, to an energy-management problem in wireless networks, and demonstrate this application in a network simulation. Second, we analyze a supervised learning framework in which the observations are assumed to be iid, and algorithms are compared by the number of prediction mistakes made in reaching a target generalization error. We provide a lower bound on mistakes for Perceptron, a standard online learning algorithm, for this framework. We introduce a modification to Perceptron and show that it avoids this lower bound, and in fact attains the optimal mistake-complexity for this setting. Third, we motivate and analyze an online active learning framework. The observations are assumed to be iid, and algorithms are judged by the number of label queries to reach a target generalization error. Our lower bound applies to the active learning setting as well, as a lower bound on labels for Perceptron paired with any active learning rule. We provide a new online active learning algorithm that avoids the lower bound, and we upper bound its label-complexity. The upper bound is optimal and also bounds the algorithm's total errors (labeled and unlabeled). We analyze the algorithm further, yielding a label-complexity bound under relaxed assumptions. Using optical character recognition data, we empirically compare the new algorithm to an online active learning algorithm with data-dependent performance guarantees, as well as to the combined variants of these two algorithms.
PhD thesis
2006年9月01日 00:00:00 GMThttps://hdl.handle.net/1721.1/339582006年09月01日T00:00:00ZPredicting the Risk and Trajectory of Intensive Care Patients Using Survival Models
https://hdl.handle.net/1721.1/33957
Predicting the Risk and Trajectory of Intensive Care Patients Using Survival Models
Hug, Caleb W.
Using artificial intelligence to assist physicians in patient care has received sustained interest over the past several decades. Recently, with automated systems at most bedsides, the amount of patient information collected continues to increase, providing specific impetus for intelligent systems that can interpret this information. In fact, the large set of sensors and test results, often measured repeatedly over long periods of time, makes it challenging for caregivers to quickly utilize all of the data for optimal patient treatment. This research focuses on predicting the survival of ICU patients throughout their stay. Unlike traditional static mortality models, this survival prediction is explored as an indicator of patient state and trajectory. Using survival analysis techniques and machine learning, models are constructed that predict individual patient survival probabilities at fixed intervals in the future. These models seek to help physicians interpret the large amount of data available in order to provide optimal patient care. We find that the survival predictions from our models are comparable to survival predictions using the SAPS score, but are available throughout the patient's ICU course instead of only at 24 hours after admission. Additionally, we demonstrate effective prediction of patient mortality over fixed windows in the future.
SM thesis
2006年8月30日 00:00:00 GMThttps://hdl.handle.net/1721.1/339572006年08月30日T00:00:00ZAnthills Built to Order: Automating Construction with Artificial Swarms
https://hdl.handle.net/1721.1/33791
Anthills Built to Order: Automating Construction with Artificial Swarms
Werfel, Justin
Social insects build large, complex structures, which emerge through the collective actions of many simple agents acting with no centralized control or preplanning. These natural systems motivate investigating the use of artificial swarms to automate construction or fabrication. The goal is to be able to take an unspecified number of simple robots and a supply of building material, give the system a high-level specification for any arbitrary structure desired, and have a guarantee that it will produce that structure without further intervention. In this thesis I describe such a distributed system for automating construction, in which autonomous mobile robots collectively build user-specified structures from square building blocks. The approach preserves many desirable features of the natural systems, such as considerable parallelism and robustness to factors like robot loss and variable order or timing of actions. Further, unlike insect colonies, it can build particular desired structures according to a high-level design provided by the user. Robots in this system act without explicit communication or cooperation, instead using the partially completed structure to coordinate their actions. This mechanism is analogous to that of stigmergy used by social insects, in which insects take actions that affect the environment, and the environmental state influences further actions. I introduce a framework of "extended stigmergy" in which building blocks are allowed to store, process or communicate information. Increasing the capabilities of the building material (rather than of the robots) in this way increases the availability of nonlocal structure information. Benefits include significant improvements in construction speed and in the ability to take advantage of the parallelism of the swarm. This dissertation describes system design and control rules for decentralized teams of robots that provably build arbitrary solid structures in two dimensions. I present a hardware prototype, and discuss extensions to more general structures, including those built with multiple block types and in three dimensions.
PhD thesis
2006年5月12日 00:00:00 GMThttps://hdl.handle.net/1721.1/337912006年05月12日T00:00:00ZResilient Network Coding In the Presence of Byzantine Adversaries
https://hdl.handle.net/1721.1/33790
Resilient Network Coding In the Presence of Byzantine Adversaries
Jaggi, Sidharth; Langberg, Michael; Katti, Sachin; Ho, Tracy; Katabi, Dina; Medard, Muriel
Network coding substantially increases network throughput. But since it involves mixing of information inside the network, a single corrupted packet generated by a malicious node can end up contaminating all the information reaching a destination, preventing decoding. This paper introduces the first distributed polynomial-time rate-optimal network codes that work in the presence of Byzantine nodes. We present algorithms that target adversaries with different attacking capabilities. When the adversary can eavesdrop on all links and jam Z links, our first algorithm achieves a rate of C-2Z, where C is the network capacity. In contrast, when the adversary has limited snooping capabilities, we provide algorithms that achieve the higher rate of C-Z.
2006年8月05日 00:00:00 GMThttps://hdl.handle.net/1721.1/337902006年08月05日T00:00:00ZHuman Document Classification Using Bags of Words
https://hdl.handle.net/1721.1/33789
Human Document Classification Using Bags of Words
Wolf, Florian; Poggio, Tomaso; Sinha, Pawan
Humans are remarkably adept at classifying text documents into categories. For instance, while reading a news story, we are rapidly able to assess whether it belongs to the domain of finance, politics or sports. Automating this task would have applications for content-based search or filtering of digital documents. To this end, it is interesting to investigate the nature of information humans use to classify documents. Here we report experimental results suggesting that this information might, in fact, be quite simple. Using a paradigm of progressive revealing, we determined classification performance as a function of the number of words. We found that subjects are able to achieve similar classification accuracy with or without syntactic information across a range of passage sizes. These results have implications for models of human text-understanding and also allow us to estimate what level of performance we can expect, in principle, from a system without requiring a prior step of complex natural language processing.
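A bag-of-words model that ignores syntax entirely, such as the multinomial naive Bayes sketch below, is a natural machine counterpart to the human condition studied here (the two-class corpus is a made-up toy):

    import math
    from collections import Counter

    def train_nb(docs):
        # docs: list of (word list, label). Pure bag of words, no syntax.
        counts, totals = {}, Counter()
        for words, label in docs:
            counts.setdefault(label, Counter()).update(words)
            totals[label] += 1
        return counts, totals

    def classify(words, counts, totals):
        vocab = {w for wc in counts.values() for w in wc}
        best, best_lp = None, -math.inf
        for label, wc in counts.items():
            n = sum(wc.values())
            lp = math.log(totals[label]) + sum(
                math.log((wc[w] + 1) / (n + len(vocab) + 1))  # add-one smoothing
                for w in words)
            if lp > best_lp:
                best, best_lp = label, lp
        return best

    docs = [("goal match striker".split(), "sports"),
            ("stocks market shares".split(), "finance")]
    print(classify("market goal shares".split(), *train_nb(docs)))   # finance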
2006年8月09日 00:00:00 GMThttps://hdl.handle.net/1721.1/337892006年08月09日T00:00:00ZDealers, Insiders and Bandits: Learning and its Effects on Market Outcomes
https://hdl.handle.net/1721.1/33235
Dealers, Insiders and Bandits: Learning and its Effects on Market Outcomes
Das, Sanmay
This thesis seeks to contribute to the understanding of markets populated by boundedly rational agents who learn from experience. Bounded rationality and learning have both been the focus of much research in computer science, economics and finance theory. However, we are at a critical stage in defining the direction of future research in these areas. It is now clear that realistic learning problems faced by agents in market environments are often too hard to solve in a classically rational fashion. At the same time, the greatly increased computational power available today allows us to develop and analyze richer market models and to evaluate different learning procedures and algorithms within these models. The danger is that the ease with which complex markets can be simulated could lead to a plethora of models that attempt to explain every known fact about different markets. The first two chapters of this thesis define a principled approach to studying learning in rich models of market environments, and the rest of the thesis provides a proof of concept by demonstrating the applicability of this approach in modeling settings drawn from two different broad domains, financial market microstructure and search theory. In the domain of market microstructure, this thesis extends two important models from the theoretical finance literature. The third chapter introduces an algorithm for setting prices in dealer markets based on the model of Glosten and Milgrom (1985), and produces predictions about the behavior of prices in securities markets. In some cases, these results confirm economic intuitions in a significantly more complex setting (like the existence of a local profit maximum for a monopolistic market-maker), and in others they can be used to provide quantitative guesses for variables such as rates of convergence to efficient market conditions following price jumps that provide insider information. The fourth chapter studies the problem faced by a trader with insider information in Kyle's (1985) model. I show how the insider trading problem can be usefully analyzed from the perspective of reinforcement learning when some important market parameters are unknown, and that the equilibrium behavior of an insider who knows these parameters can be learned by one who does not, but also that the time scale of convergence to the equilibrium behavior may be impractical, and agents with limited time horizons may be better off using approximate algorithms that do not converge to equilibrium behavior. The fifth and sixth chapters relate to search problems. Chapter 5 introduces models for a class of problems in which there is a search "season" prior to hiring or matching, like academic job markets. It solves for expected values in many cases, and studies the difference between a "high information" process, where applicants are immediately told when they have been rejected, and a "low information" process, where employers do not send any signal when they reject an applicant. The most important intuition to emerge from the results is that the relative benefit of the high information process is much greater when applicants do not know their own "attractiveness," which implies that search markets might be able to eliminate inefficiencies effectively by providing good information, and we do not always have to think about redesigning markets as a whole.
Chapter 6 studies two-sided search explicitly and introduces a new class of multi-agent learning problems, two-sided bandit problems, that capture the learning and decision problems of agents in matching markets in which agents must learn their preferences. It also empirically studies outcomes under different periodwise matching mechanisms and shows that some basic intuitions about the asymptotic stability of matchings are preserved in the model. For example, when agents are matched in each period using the Gale-Shapley algorithm (sketched after this entry), asymptotic outcomes are always stable, while a matching mechanism that induces a stopping problem for some agents leads to the lowest probabilities of stability. By contributing to the state of the art in modeling different domains using computational techniques, this thesis demonstrates the success of the approach to modeling complex economic and social systems that is prescribed in the first two chapters.
PhD thesis
2006年7月12日 00:00:00 GMThttps://hdl.handle.net/1721.1/332352006年07月12日T00:00:00ZIterative Collaborative Ranking of Customers and Providers
https://hdl.handle.net/1721.1/33234
Iterative Collaborative Ranking of Customers and Providers
Teow, Loo Nin; Katabi, Dina
This paper introduces a new application: predicting the Internet provider-customer market. We cast the problem in the collaborative filtering framework, where we use current and past customer-provider relationships to compute for each Internet customer a ranking of potential future service providers. Furthermore, for each Internet service provider (ISP), we rank potential future customers. We develop a novel iterative ranking algorithm that draws inspiration from several sources, including collaborative filtering, webpage ranking, and kernel methods. Further analysis of our algorithm shows that it can be formulated in terms of an affine eigenvalue problem. Experiments on the actual Internet customer-provider data show promising results.
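The paper's formulation as an affine eigenvalue problem suggests a fixed-point iteration of the form x = (Ax + b)/||Ax + b||. The Python sketch below shows only this generic iteration; the matrix A and vector b are random placeholders, whereas the paper derives them from the customer-provider relationship data.

```python
import numpy as np

def affine_power_iteration(A, b, iters=200, tol=1e-10):
    """Fixed-point iteration for x = (A x + b) / ||A x + b||,
    a normalized affine analogue of the power method."""
    x = np.full(A.shape[0], 1.0 / A.shape[0])
    for _ in range(iters):
        y = A @ x + b
        y /= np.linalg.norm(y)
        if np.linalg.norm(y - x) < tol:
            break
        x = y
    return x

rng = np.random.default_rng(0)
A = rng.random((5, 5))            # placeholder affinity matrix
b = rng.random(5)                 # placeholder affine term
scores = affine_power_iteration(A, b)
print(np.argsort(-scores))        # entities ranked by converged score
```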
2006年7月04日 00:00:00 GMThttps://hdl.handle.net/1721.1/332342006年07月04日T00:00:00ZMORE: A Network Coding Approach to Opportunistic Routing
https://hdl.handle.net/1721.1/33230
MORE: A Network Coding Approach to Opportunistic Routing
Chachulski, Szymon; Jennings, Michael; Katti, Sachin; Katabi, Dina
Opportunistic routing has the potential to substantially increase wireless network throughput. Prior work on opportunistic routing, however, requires tight node coordination. Different nodes in a network must have knowledge of which packets other nodes have received. Furthermore, the nodes have to agree on which nodes should transmit which packets. Such coordination becomes fragile in dense or large networks. This paper introduces MORE, a new opportunistic routing protocol that avoids node coordination. Our design is rooted in the theory of network coding. Routers code packets going to the same destination and forward the coded versions. The destination decodes and recovers the original packets. This approach needs no coordination and provably maximizes network throughput. We have implemented our design and evaluated it in a 25-node testbed. Our results show that MORE provides an average throughput increase of 60% and a maximum of 10-fold, demonstrating that the theoretical gains promised by network coding are realizable in practice.
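To illustrate the coding idea, here is a minimal Python sketch of intra-flow random linear network coding, under the simplifying assumption of GF(2) coefficients (a coded packet is then an XOR of a random subset of the batch); practical systems typically use a larger field. Packet contents, batch size, and the transmission count are toy placeholders.

```python
import numpy as np

def encode(packets, rng):
    """One coded transmission: a random GF(2) linear combination of the
    source packets, sent together with its coefficient vector."""
    coeffs = rng.integers(0, 2, size=len(packets), dtype=np.uint8)
    coded = np.zeros_like(packets[0])
    for c, p in zip(coeffs, packets):
        if c:
            coded ^= p
    return coeffs, coded

def gf2_rank(matrix):
    """Rank of a 0/1 matrix over GF(2); the destination can decode the
    whole batch once the received coefficient vectors reach full rank."""
    rows = [int("".join(map(str, row)), 2) for row in matrix]
    rank = 0
    while rows:
        pivot = max(rows)                  # row with the highest leading bit
        if pivot == 0:
            break
        rows.remove(pivot)
        top = 1 << (pivot.bit_length() - 1)
        rows = [r ^ pivot if r & top else r for r in rows]
        rank += 1
    return rank

rng = np.random.default_rng(1)
packets = [rng.integers(0, 2, size=8, dtype=np.uint8) for _ in range(3)]
received = [encode(packets, rng) for _ in range(5)]   # 5 coded transmissions
coeff_matrix = [c.tolist() for c, _ in received]
print("decodable:", gf2_rank(coeff_matrix) == len(packets))
```

Because a random combination is (with high probability) innovative to any receiver that is not yet at full rank, forwarders need not coordinate about which specific packets to resend.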
2006年6月30日 00:00:00 GMThttps://hdl.handle.net/1721.1/332302006年06月30日T00:00:00ZRobust Execution of Bipedal Walking Tasks From Biomechanical Principles
https://hdl.handle.net/1721.1/33229
Robust Execution of Bipedal Walking Tasks From Biomechanical Principles
Hofmann, Andreas
Effective use of robots in unstructured environments requires that they have sufficient autonomy and agility to execute task-level commands successfully. A challenging example of such a robot is a bipedal walking machine. Such a robot should be able to walk to a particular location within a particular time, while observing foot placement constraints, and avoiding a fall, if this is physically possible. Although stable walking machines have been built, the problem of task-level control, where the tasks have stringent state-space and temporal requirements, and where significant disturbances may occur, has not been studied extensively. This thesis addresses this problem through three objectives. The first is to devise a plan specification where task requirements are expressed in a qualitative form that provides for execution flexibility. The second is to develop a task-level executive that accepts such a plan, and outputs a sequence of control actions that result in successful plan execution. The third is to provide this executive with disturbance handling ability. Development of such an executive is challenging because the biped is highly nonlinear and has limited actuation due to its limited base of support. We address these challenges with three key innovations. To address the nonlinearity, we develop a dynamic virtual model controller to linearize the biped, and thus provide an abstracted biped that is easier to control. The controller is model-based, but uses a sliding control technique to compensate for model inaccuracy. To address the under-actuation, our system generates flow tubes, which define valid operating regions in the abstracted biped. The flow tubes represent sets of state trajectories that take into account dynamic limitations due to under-actuation, and also satisfy plan requirements. The executive keeps trajectories in the flow tubes by adjusting a small number of control parameters for key state variables in the abstracted biped, such as center of mass. Additionally, our system uses a novel strategy that employs angular momentum to enhance translational controllability of the system's center of mass. We evaluate our approach using a high-fidelity biped simulation. Tests include walking with foot-placement constraints, kicking a soccer ball, and disturbance recovery.
PhD thesis
2006年4月28日 00:00:00 GMThttps://hdl.handle.net/1721.1/332292006年04月28日T00:00:00ZWas the Patient Cured? Understanding Semantic Categories and Their Relationships in Patient Records
https://hdl.handle.net/1721.1/33223
Was the Patient Cured? Understanding Semantic Categories and Their Relationships in Patient Records
Sibanda, Tawanda Carleton
In this thesis, we detail an approach to extracting key information in medical discharge summaries. Starting with a narrative patient report, we first identify and remove information that compromises privacy (de-identification); next we recognize words and phrases in the text belonging to semantic categories of interest to doctors (semantic category recognition). For diseases and symptoms, we determine whether the problem is present, absent, uncertain, or associated with somebody else (assertion classification). Finally, we classify the semantic relationships existing between our categories (semantic relationship classification). Our approach utilizes a series of statistical models that rely heavily on local lexical and syntactic context, and achieve competitive results compared to more complex NLP solutions. We conclude the thesis by presenting the design for the Category and Relationship Extractor (CaRE). CaRE combines our solutions to de-identification, semantic category recognition, assertion classification, and semantic relationship classification into a single application that facilitates the easy extraction of semantic information from medical text.
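The four stages described above compose into a single extraction pipeline; the Python sketch below makes that structure explicit. Every stage here is a hypothetical stub, since in the thesis each stage is a trained statistical model over local lexical and syntactic context.

```python
def de_identify(text):
    # Stub: a real de-identifier finds and replaces all identifying info.
    return text.replace("John Doe", "[PATIENT]")

def recognize_categories(text):
    # Stub: returns (phrase, semantic category) pairs found in the text.
    return [("chest pain", "symptom")]

def classify_assertion(phrase, text):
    # Stub: one of "present", "absent", "uncertain", "someone-else".
    return "present"

def classify_relations(entities):
    # Stub: returns (entity, relation, entity) triples.
    return []

def extract(text):
    """De-identification -> category recognition -> assertion
    classification -> relationship classification."""
    text = de_identify(text)
    entities = recognize_categories(text)
    asserted = [(p, c, classify_assertion(p, text)) for p, c in entities]
    return asserted, classify_relations(entities)

print(extract("John Doe reports chest pain."))
```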
MEng thesis
2006年6月28日 00:00:00 GMThttps://hdl.handle.net/1721.1/332232006年06月28日T00:00:00ZUsing Task-Structured Probabilistic I/O Automata to Analyze an Oblivious Transfer Protocol
https://hdl.handle.net/1721.1/33217
Using Task-Structured Probabilistic I/O Automata to Analyze an Oblivious Transfer Protocol
Canetti, Ran; Cheung, Ling; Kaynar, Dilsun; Liskov, Moses; Lynch, Nancy; Pereira, Olivier; Segala, Roberto
The Probabilistic I/O Automata framework of Lynch, Segala and Vaandrager provides tools for precisely specifying protocols and reasoning about their correctness using multiple levels of abstraction, based on implementation relationships between these levels. We enhance this framework to allow analyzing protocols that use cryptographic primitives. This requires resolving and reconciling issues such as nondeterministic behavior and scheduling, randomness, resource-bounded computation, and computational hardness assumptions. The enhanced framework allows for more rigorous and systematic analysis of cryptographic protocols. To demonstrate the use of this framework, we present an example analysis that we have done for an Oblivious Transfer protocol.
2006年6月20日 00:00:00 GMThttps://hdl.handle.net/1721.1/332172006年06月20日T00:00:00ZUsing Probabilistic I/O Automata to Analyze an Oblivious Transfer Protocol
https://hdl.handle.net/1721.1/33154
Using Probabilistic I/O Automata to Analyze an Oblivious Transfer Protocol
Canetti, Ran; Cheung, Ling; Kaynar, Dilsun; Liskov, Moses; Lynch, Nancy; Pereira, Olivier; Segala, Roberto
We demonstrate how to carry out cryptographic security analysis of distributed protocols within the Probabilistic I/O Automata framework of Lynch, Segala, and Vaandrager. This framework provides tools for arguing rigorously about the concurrency and scheduling aspects of protocols, and about protocols presented at different levels of abstraction. Consequently, it can help in making cryptographic analysis more precise and less susceptible to errors. We concentrate on a relatively simple two-party Oblivious Transfer protocol, in the presence of a semi-honest adversary (essentially, an eavesdropper). For the underlying cryptographic notion of security, we use a version of Canetti's Universally Composable security. In spite of the relative simplicity of the example, the exercise is quite nontrivial. It requires taking many fundamental issues into account, including nondeterministic behavior, scheduling, resource-bounded computation, and computational hardness assumptions for cryptographic primitives.
2006年6月19日 00:00:00 GMThttps://hdl.handle.net/1721.1/331542006年06月19日T00:00:00ZApproximate Correspondences in High Dimensions
https://hdl.handle.net/1721.1/33002
Approximate Correspondences in High Dimensions
Grauman, Kristen; Darrell, Trevor
Pyramid intersection is an efficient method for computing an approximate partial matching between two sets of feature vectors. We introduce a novel pyramid embedding based on a hierarchy of non-uniformly shaped bins that takes advantage of the underlying structure of the feature space and remains accurate even for sets with high-dimensional feature vectors. The matching similarity is computed in linear time and forms a Mercer kernel. We also show how the matching itself (a correspondence field) may be extracted for a small increase in computational cost. Whereas previous matching approximation algorithms suffer from distortion factors that increase linearly with the feature dimension, we demonstrate that our approach can maintain constant accuracy even as the feature dimension increases. When used as a kernel in a discriminative classifier, our approach achieves improved object recognition results over a state-of-the-art set kernel.
2006年6月15日 00:00:00 GMThttps://hdl.handle.net/1721.1/330022006年06月15日T00:00:00ZNew Techniques for Geographic Routing
https://hdl.handle.net/1721.1/33000
New Techniques for Geographic Routing
Leong, Ben
As wireless sensor networks continue to grow in size, we are faced with the prospect of emerging wireless networks with hundreds or thousands of nodes. Geographic routing algorithms are a promising alternative to traditional ad hoc routing algorithms in this new domain for point-to-point routing, but deployments of such algorithms are currently uncommon because of some practical difficulties. This dissertation explores techniques that address two major issues in the deployment of geographic routing algorithms: (i) the costs associated with distributed planarization and (ii) the unavailability of location information. We present and evaluate two new algorithms for geographic routing: Greedy Distributed Spanning Tree Routing (GDSTR) and Greedy Embedding Spring Coordinates (GSpring). Unlike previous geographic routing algorithms, which require the planarization of the network connectivity graph, GDSTR switches to routing on a spanning tree instead of a planar graph when packets end up at dead ends during greedy forwarding. To choose a direction on the tree that is most likely to make progress towards the destination, each GDSTR node maintains a summary of the area covered by the subtree below each of its tree neighbors using convex hulls. This distributed data structure is called a hull tree. GDSTR not only requires an order of magnitude less bandwidth to maintain these hull trees than CLDP, the only distributed planarization algorithm that is known to work with practical radio networks, it often achieves better routing performance than previous planarization-based geographic routing algorithms. GSpring is a new virtual coordinate assignment algorithm that derives good coordinates for geographic routing when location information is not available. Starting from a set of initial coordinates for a set of elected perimeter nodes, GSpring uses a modified spring relaxation algorithm to incrementally adjust virtual coordinates to increase the convexity of voids in the virtual routing topology. This reduces the probability that packets will end up in dead ends during greedy forwarding, and improves the routing performance of existing geographic routing algorithms. The coordinates derived by GSpring yield comparable routing performance to that for actual physical coordinates, and significantly better performance than that for NoGeo, the best existing algorithm for deriving virtual coordinates for geographic routing. Furthermore, GSpring is the first known algorithm that is able to derive coordinates that achieve better geographic routing performance than actual physical coordinates for networks with obstacles.
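The spring-relaxation step at the heart of GSpring can be pictured with a short Python sketch: each radio link acts as a spring pulling its endpoints toward a preferred length. This is only the generic relaxation; GSpring's perimeter-node initialization and its modified update that increases the convexity of voids are omitted, and the topology and constants here are toy assumptions.

```python
import numpy as np

def spring_relax(coords, edges, rest=1.0, step=0.05, iters=500):
    """Iteratively move nodes along Hooke's-law forces from their links."""
    coords = coords.copy()
    for _ in range(iters):
        force = np.zeros_like(coords)
        for i, j in edges:
            d = coords[j] - coords[i]
            dist = np.linalg.norm(d) + 1e-9
            f = step * (dist - rest) * d / dist  # pull if stretched, push if compressed
            force[i] += f
            force[j] -= f
        coords += force
    return coords

rng = np.random.default_rng(2)
nodes = rng.random((6, 2))                    # random initial virtual coordinates
links = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0)]  # a 6-cycle
print(spring_relax(nodes, links).round(2))
```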
PhD thesis
2006年6月14日 00:00:00 GMThttps://hdl.handle.net/1721.1/330002006年06月14日T00:00:00ZSchematic Querying of Large Tracking Databases
https://hdl.handle.net/1721.1/32999
Schematic Querying of Large Tracking Databases
Dalley, Gerald; Izo, Tomas
In dealing with long-term tracking databases with wide-area coverage, an important problem is in formulating an intuitive and fast query system for analysis. In such a query system, a user who is not a computer vision researcher should be able to readily specify a novel query to the system and obtain the desired results. Furthermore, these queries should be able to not only search out individual actors (e.g. "find all white cars") but also find interactions amongst multiple actors (e.g. "find all drag racing activities in the city"). Informally, we have found that people often use sketches when describing activities and interactions. In this paper, we demonstrate a preliminary system that automatically interprets schematic drawings of activities. The system transforms the schematics into executable code that searches a tracking database. Through our query optimization, these queries tend to take orders of magnitude less time to execute than equivalent queries running on a partially-optimized SQL database.
2006年6月12日 00:00:00 GMThttps://hdl.handle.net/1721.1/329992006年06月12日T00:00:00ZInfrastructure for Engineered Emergence on Sensor/Actuator Networks
https://hdl.handle.net/1721.1/32988
Infrastructure for Engineered Emergence on Sensor/Actuator Networks
Beal, Jacob; Bachrach, Jonathan
The ability to control emergent phenomena depends on decomposing them into aspects susceptible to independent engineering. For spatial self-managing systems, the amorphous-medium abstraction lets you separate the system's specification from its implementation.
2006年3月01日 00:00:00 GMThttps://hdl.handle.net/1721.1/329882006年03月01日T00:00:00ZCogSci to AI: It's the Brainware, Stupid!
https://hdl.handle.net/1721.1/32987
CogSci to AI: It's the Brainware, Stupid!
Beal, Jacob; Sussman, Gerald
Current modularization techniques fail when applied to hard AI problems. But cognitive science shows that the mind has modules specialized for particular functions. Unlike current engineered modules, the modules of the mind learn to communicate with each other as a child matures. Kirby's ideas on language evolution, combined with constraints derived from neuroanatomy, yield a new mechanism for integrating modules into a system: a communications bootstrapping system in which two agents build a shared vocabulary capturing information common to their mutual experience, including cross-module knowledge about the world.
2006年3月01日 00:00:00 GMThttps://hdl.handle.net/1721.1/329872006年03月01日T00:00:00ZAmorphous Medium Language
https://hdl.handle.net/1721.1/32986
Amorphous Medium Language
Beal, Jacob
Programming reliable behavior on a large mesh network composed of unreliable parts is difficult. Amorphous Medium Language addresses this problem by abstracting robustness and networking issues away from the programmer via a language of geometric primitives and homeostasis maintenance. AML is designed to operate on a high diameter network composed of thousands to billions of nodes, and does not assume coordinate, naming, or routing services. Computational processes are distributed through geometric regions of the space approximated by the network and specify behavior in terms of homeostasis conditions and actions to be taken when homeostasis is violated. AML programs are compiled for local execution using previously developed amorphous computing primitives which provide robustness against ongoing failures and joins, and localize the impact of changes in topology. I show some examples of how AML allows complex robust behavior to be expressed in simple programs, and some preliminary results from simulation.
2005年7月01日 00:00:00 GMThttps://hdl.handle.net/1721.1/329862005年07月01日T00:00:00ZProgramming an Amorphous Computational Medium
https://hdl.handle.net/1721.1/32985
Programming an Amorphous Computational Medium
Beal, Jacob
Amorphous computing considers the problem of controlling millions of spatially distributed unreliable devices which communicate only with nearby neighbors. To program such a system, we need a high-level description language for desired global behaviors, and a system to compile such descriptions into locally executing code which robustly creates and maintains the desired global behavior. I survey existing amorphous computing primitives and give desiderata for a language describing computation on an amorphous computer. I then bring these together in Amorphous Medium Language, which computes on an amorphous computer as though it were a space-filling computational medium.
2004年9月01日 00:00:00 GMThttps://hdl.handle.net/1721.1/329852004年09月01日T00:00:00ZWhat the Assassin's Guild Taught Me About Distributed Computing
https://hdl.handle.net/1721.1/32984
What the Assassin's Guild Taught Me About Distributed Computing
Beal, Jacob
Distributed computing and live-action roleplaying share many of the same fundamental problems, as live-action roleplaying games commonly include simulations carried out by their players. Games run by the MIT Assassin's Guild are particularly illustrative of distributed computing issues due to their large scope and high complexity. I discuss three distributed computing issues addressed by Assassin's Guild game design---information hiding, error correction, and liveness/consistency tradeoffs---and the relevance of the solutions used by game writers to current problems in distributed computing.
2006年5月27日 00:00:00 GMThttps://hdl.handle.net/1721.1/329842006年05月27日T00:00:00ZFirst Class Copy & Paste
https://hdl.handle.net/1721.1/32980
First Class Copy & Paste
Edwards, Jonathan
The Subtext project seeks to make programming fundamentally easier by altering the nature of programming languages and tools. This paper defines an operational semantics for an essential subset of the Subtext language. It also presents a fresh approach to the problems of mutable state, I/O, and concurrency. Inclusions reify copy & paste edits into persistent relationships that propagate changes from their source into their destination. Inclusions formulate a programming language in which there is no distinction between a program's representation and its execution. Like spreadsheets, programs are live executions within a persistent runtime, and programming is direct manipulation of these executions via a graphical user interface. There is no need to encode programs into source text. Mutation of state is effected by the computation of hypothetical recursive variants of the state, which can then be lifted into new versions of the state. Transactional concurrency is based upon queued single-threaded execution. Speculative execution of queued hypotheticals provides concurrency as a semantically transparent implementation optimization.
2006年5月22日 00:00:00 GMThttps://hdl.handle.net/1721.1/329802006年05月22日T00:00:00ZLearning using the Born Rule
https://hdl.handle.net/1721.1/32978
Learning using the Born Rule
Wolf, Lior
In Quantum Mechanics, the transition from a deterministic description to a probabilistic one is done using a simple rule termed the Born rule. This rule states that the probability of an outcome ($a$) given a state ($\Psi$) is the square of their inner product ($(a^\top\Psi)^2$). In this paper, we unravel a new probabilistic justification for popular algebraic algorithms, based on the Born rule. These algorithms include two-class and multiple-class spectral clustering, and algorithms based on Euclidean distances.
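For concreteness, the rule itself is a one-liner; the snippet below computes Born-rule probabilities for a toy two-dimensional state against an orthonormal outcome basis (values chosen only for illustration).

```python
import numpy as np

# Born rule: P(a | psi) = (a . psi)^2. With unit-norm psi and an
# orthonormal set of outcomes, the probabilities sum to 1.
psi = np.array([0.6, 0.8])        # unit-norm state
outcomes = np.eye(2)              # orthonormal outcome vectors
probs = (outcomes @ psi) ** 2
print(probs, probs.sum())         # [0.36 0.64] 1.0
```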
2006年5月16日 00:00:00 GMThttps://hdl.handle.net/1721.1/329782006年05月16日T00:00:00ZA Machine-Checked Safety Proof for a CISC-Compatible SFI Technique
https://hdl.handle.net/1721.1/32546
A Machine-Checked Safety Proof for a CISC-Compatible SFI Technique
McCamant, Stephen
Executing untrusted code while preserving security requires that the code be prevented from modifying memory or executing instructions except as explicitly allowed. Software-based fault isolation (SFI) or "sandboxing" enforces such a policy by rewriting code at the instruction level. In previous work, we developed a new SFI technique that is applicable to CISC architectures such as the Intel IA-32, based on enforcing additional alignment constraints to avoid difficulties with variable-length instructions. This report describes a machine-checked proof we developed to increase our confidence in the safety provided by the technique. The proof, constructed for a simplified model of the technique using the ACL2 theorem proving environment, certifies that if the code rewriting has been checked to have been performed correctly, the resulting program cannot perform a dangerous operation when run. We describe the high-level structure of the proof, then give the intermediate lemmas with interspersed commentary, and finally evaluate the process of the proof's construction.
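The flavor of the rewriting being proved safe can be conveyed by a toy model of its two sandboxing transformations: masking store addresses into a designated data region, and forcing indirect jump targets onto fixed-size chunk boundaries so control can never land inside a variable-length instruction. The Python sketch below uses illustrative constants, not the report's actual region layout or chunk size.

```python
DATA_BASE, DATA_SIZE = 0x2000_0000, 0x1000_0000   # sandboxed data region
CHUNK = 16                                        # instruction-chunk size

def sandbox_store_addr(addr):
    """Force a store address into the data region by bit masking."""
    return DATA_BASE | (addr & (DATA_SIZE - 1))

def sandbox_jump_target(target):
    """Force an indirect jump target onto a chunk boundary."""
    return target & ~(CHUNK - 1)

addr = 0xDEAD_BEEF
print(hex(sandbox_store_addr(addr)))   # always lands inside the data region
print(hex(sandbox_jump_target(addr)))  # always 16-byte aligned
```

The machine-checked proof then shows, roughly, that if a verifier confirms every store and indirect jump in the rewritten code goes through such checks, no execution can perform a dangerous operation.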
2006年5月11日 00:00:00 GMThttps://hdl.handle.net/1721.1/325462006年05月11日T00:00:00ZLearning a Dictionary of Shape-Components in Visual Cortex: Comparison with Neurons, Humans and Machines
https://hdl.handle.net/1721.1/32544
Learning a Dictionary of Shape-Components in Visual Cortex: Comparison with Neurons, Humans and Machines
Serre, Thomas
In this thesis, I describe a quantitative model that accounts for the circuits and computations of the feedforward path of the ventral stream of visual cortex. This model is consistent with a general theory of visual processing that extends the hierarchical model of (Hubel & Wiesel, 1959) from primary to extrastriate visual areas. It attempts to explain the first few hundred milliseconds of visual processing and "immediate recognition". One of the key elements in the approach is the learning of a generic dictionary of shape-components from V2 to IT, which provides an invariant representation to task-specific categorization circuits in higher brain areas. This vocabulary of shape-tuned units is learned in an unsupervised manner from natural images, and constitutes a large and redundant set of image features with different complexities and invariances. This theory significantly extends an earlier approach by (Riesenhuber & Poggio, 1999) and builds upon several existing neurobiological models and conceptual proposals. First, I present evidence to show that the model can duplicate the tuning properties of neurons in various brain areas (e.g., V1, V4 and IT). In particular, the model agrees with data from V4 about the response of neurons to combinations of simple two-bar stimuli (Reynolds et al., 1999) (within the receptive field of the S2 units), and some of the C2 units in the model show a tuning for boundary conformations which is consistent with recordings from V4 (Pasupathy & Connor, 2001). Second, I show that not only can the model duplicate the tuning properties of neurons in various brain areas when probed with artificial stimuli, but it can also handle the recognition of objects in the real world, to the extent of competing with the best computer vision systems. Third, I describe a comparison between the performance of the model and the performance of human observers in a rapid animal vs. non-animal recognition task for which recognition is fast and cortical back-projections are likely to be inactive. Results indicate that the model predicts human performance extremely well when the delay between the stimulus and the mask is about 50 ms. This suggests that cortical back-projections may not play a significant role when the time interval is in this range, and the model may therefore provide a satisfactory description of the feedforward path. Taken together, the evidence suggests that we may have the skeleton of a successful theory of visual cortex. In addition, this may be the first time that a neurobiological model, faithful to the physiology and anatomy of visual cortex, not only competes with some of the best computer vision systems, thus providing a realistic alternative to engineered artificial vision systems, but also achieves performance close to that of humans in a categorization task involving complex natural images.
PhD thesis
2006年4月25日 00:00:00 GMThttps://hdl.handle.net/1721.1/325442006年04月25日T00:00:00ZAbstraction Layers for Scalable Microfluidic Biocomputers (Extended Version)
https://hdl.handle.net/1721.1/32543
Abstraction Layers for Scalable Microfluidic Biocomputers (Extended Version)
Thies, William; Urbanski, John Paul; Thorsen, Todd; Amarasinghe, Saman
Microfluidic devices are emerging as an attractive technology for automatically orchestrating the reactions needed in a biological computer. Thousands of microfluidic primitives have already been integrated on a single chip, and recent trends indicate that the hardware complexity is increasing at rates comparable to Moore's Law. As in the case of silicon, it will be critical to develop abstraction layers--such as programming languages and Instruction Set Architectures (ISAs)--that decouple software development from changes in the underlying device technology. Towards this end, this paper presents BioStream, a portable language for describing biology protocols, and the Fluidic ISA, a stable interface for microfluidic chip designers. A novel algorithm translates microfluidic mixing operations from the BioStream layer to the Fluidic ISA. To demonstrate the benefits of these abstraction layers, we build two microfluidic chips that can both execute BioStream code despite significant differences at the device level. We consider this to be an important step towards building scalable biocomputers.
2006年5月05日 00:00:00 GMThttps://hdl.handle.net/1721.1/325432006年05月05日T00:00:00ZSupplement to "Distributed Quota Enforcement for Spam Control"
https://hdl.handle.net/1721.1/32542
Supplement to "Distributed Quota Enforcement for Spam Control"
Walfish, Michael; Zamfirescu, J.D.; Balakrishnan, Hari; Karger, David; Shenker, Scott
This report is a supplement to our paper "Distributed Quota Enforcement for Spam Control" (NSDI 2006). We assume here that the reader has read the main paper. In this report, we first analyze the enforcer nodes' key-value maps and then analyze two of the experiments from the main paper.
2006年4月29日 00:00:00 GMThttps://hdl.handle.net/1721.1/325422006年04月29日T00:00:00ZA Combined Stochastic and Greedy Hybrid Estimation Capability for Concurrent Hybrid Models with Autonomous Mode Transitions
https://hdl.handle.net/1721.1/32539
A Combined Stochastic and Greedy Hybrid Estimation Capability for Concurrent Hybrid Models with Autonomous Mode Transitions
Blackmore, Lars; Funiak, Stanislav; Williams, Brian
Robotic and embedded systems have become increasingly pervasive in applications ranging from space probes and life support systems to robot assistants. In order to act robustly in the physical world, robotic systems must be able to detect changes in operational mode, such as faults, whose symptoms manifest themselves only in the continuous state. In such systems, the state is observed indirectly, and must therefore be estimated in a robust, memory-efficient manner from noisy observations. Probabilistic hybrid discrete/continuous models, such as Concurrent Probabilistic Hybrid Automata (CPHA), are convenient modeling tools for such systems. In CPHA, the hidden state is represented with discrete and continuous state variables that evolve probabilistically. In this paper, we present a novel method for estimating the hybrid state of CPHA that achieves robustness by balancing greedy and stochastic search. The key insight is that stochastic and greedy search methods, taken together, are often particularly effective in practice. To accomplish this, we first develop an efficient stochastic sampling approach for CPHA based on Rao-Blackwellised Particle Filtering. We then propose a strategy for mixing stochastic and greedy search. The resulting method is able to handle three particularly challenging aspects of real-world systems, namely that they 1) exhibit autonomous mode transitions, 2) consist of a large collection of concurrently operating components, and 3) are non-linear. Autonomous mode transitions, that is, discrete transitions that depend on the continuous state, are particularly challenging to address, since they couple the discrete and continuous state evolution tightly. In this paper we extend the class of autonomous mode transitions that can be handled to arbitrary piecewise polynomial transition distributions. We perform an empirical comparison of the greedy and stochastic approaches to hybrid estimation, and then demonstrate the robustness of the mixed method incorporated into our HME (Hybrid Mode Estimation) capability. We show that this robustness comes at only a small performance penalty.
2006年4月28日 00:00:00 GMThttps://hdl.handle.net/1721.1/325392006年04月28日T00:00:00ZA Probabilistic Particle Control Approach to Optimal, Robust Predictive Control
https://hdl.handle.net/1721.1/32538
A Probabilistic Particle Control Approach to Optimal, Robust Predictive Control
Blackmore, Lars
Autonomous vehicles need to be able to plan trajectories to a specified goal that avoid obstacles and are robust to the inherent uncertainty in the problem. This uncertainty arises due to uncertain state estimation, disturbances and modeling errors. Previous work solved the robust path planning problem using a finite horizon optimal stochastic control approach, which finds the optimal path subject to chance constraints ensuring that the probability of collision with obstacles is below a given threshold. This approach is limited to problems where all uncertain distributions are Gaussian, and typically results in highly conservative plans. In many cases, however, the Gaussian assumption is invalid; for example, in the case of localization, the belief state about a vehicle's position can consist of highly non-Gaussian, even multimodal, distributions. In this paper we present a novel method for finite horizon stochastic control of dynamic systems subject to chance constraints. The method approximates the distribution of the system state using a finite number of particles. By expressing these particles in terms of the control variables, we are able to approximate the original stochastic control problem as a deterministic one; furthermore, the approximation becomes exact as the number of particles tends to infinity. For a general class of chance constrained problems with linear system dynamics, we show that the approximate problem can be solved using efficient Mixed-Integer Linear Programming techniques. We apply the new method to aircraft control in turbulence, and show simulation results that demonstrate the efficacy of the approach.
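The core approximation can be sketched directly: propagate a set of particles through the (linear) dynamics under a candidate control sequence and estimate the chance constraint as the fraction of particles that ever enter the unsafe set. The Python sketch below only evaluates this probability for a fixed control sequence; in the paper the particle count is embedded as a constraint in a MILP that optimizes over the controls. The dynamics, noise, and obstacle half-plane here are toy assumptions.

```python
import numpy as np

def collision_probability(x0_samples, controls, A, B, W, unsafe, rng):
    """Monte Carlo estimate of P(ever unsafe) under x' = Ax + Bu + w."""
    particles = x0_samples.copy()
    hit = np.zeros(len(particles), dtype=bool)
    for u in controls:
        w = rng.multivariate_normal(np.zeros(A.shape[0]), W,
                                    size=len(particles))
        particles = particles @ A.T + u @ B.T + w
        hit |= np.array([unsafe(p) for p in particles])
    return hit.mean()

rng = np.random.default_rng(3)
A, B, W = np.eye(2), np.eye(2), 0.01 * np.eye(2)
x0 = rng.multivariate_normal([0.0, 0.0], 0.05 * np.eye(2), size=500)
controls = [np.array([0.1, 0.0])] * 10          # fixed toy control sequence
unsafe = lambda p: p[0] > 1.5                   # toy obstacle half-plane
print(collision_probability(x0, controls, A, B, W, unsafe, rng))
```

Because each particle's trajectory is an explicit function of the controls, "particle i is unsafe" becomes a set of linear constraints with a binary indicator, which is what makes the MILP encoding possible.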
2006年4月28日 00:00:00 GMThttps://hdl.handle.net/1721.1/325382006年04月28日T00:00:00ZCoordinating Agile Systems through the Model-based Execution of Temporal Plans
https://hdl.handle.net/1721.1/32537
Coordinating Agile Systems through the Model-based Execution of Temporal Plans
Leaute, Thomas
Agile autonomous systems are emerging, such as unmanned aerial vehicles (UAVs), that must robustly perform tightly coordinated time-critical missions, for example military surveillance or search-and-rescue scenarios. In the space domain, execution of temporally flexible plans has provided an enabler for achieving the desired coordination and robustness, in the context of space probes and planetary rovers, modeled as discrete systems. We address the challenge of extending plan execution to systems with continuous dynamics, such as air vehicles and robot manipulators, that are controlled indirectly through the setting of continuous state variables. Systems with continuous dynamics are more challenging than discrete systems, because they require continuous, low-level control, and cannot be controlled by issuing simple sequences of discrete commands. Hence, manually controlling these systems (or plants) at a low level can become very costly in terms of the number of human operators necessary to operate the plant. For example, in the case of a fleet of UAVs performing a search-and-rescue scenario, the traditional approach to controlling the UAVs involves providing a series of close waypoints for each aircraft, which incurs a high workload for the human operators when the fleet consists of a large number of vehicles. Our solution is a novel, model-based executive, called Sulu, that takes as input a qualitative state plan specifying the desired evolution of the state of the system. This approach elevates the interaction between the human operator and the plant to a more abstract level, where the operator is able to "coach" the plant by qualitatively specifying the tasks, or activities, the plant must perform. These activities are described in a qualitative manner, because they specify regions in the plant's state space in which the plant must be at a certain point in time. Time constraints are also described qualitatively, in the form of flexible temporal constraints between activities in the state plan. The design of low-level control inputs to meet this abstract goal specification is then delegated to the autonomous controller, hence decreasing the workload per human operator. This approach also provides robustness to the executive, by giving it room to adapt to disturbances and unforeseen events while satisfying the qualitative constraints on the plant state specified in the qualitative state plan. Sulu reasons on a model of the plant in order to dynamically generate near-optimal control sequences to fulfill the qualitative state plan. To achieve optimality and safety, Sulu plans into the future, framing the problem as a disjunctive linear programming problem. To achieve robustness to disturbances and maintain tractability, planning is folded within a receding horizon, continuous planning and execution framework. The key to performance is a problem reduction method based on constraint pruning. We benchmark performance using multi-UAV firefighting scenarios on a real-time, hardware-in-the-loop testbed.
SM thesis
2006年4月28日 00:00:00 GMThttps://hdl.handle.net/1721.1/325372006年04月28日T00:00:00ZDetecting and tracking multiple interacting objects without class-specific models
https://hdl.handle.net/1721.1/32536
Detecting and tracking multiple interacting objects without class-specific models
Bose, Biswajit; Wang, Xiaogang; Grimson, Eric
We propose a framework for detecting and tracking multiple interacting objects from a single, static, uncalibrated camera. The number of objects is variable and unknown, and object-class-specific models are not available. We use background subtraction results as measurements for object detection and tracking. Given these constraints, the main challenge is to associate pixel measurements with (possibly interacting) object targets. We first track clusters of pixels, and note when they merge or split. We then build an inference graph, representing relations between the tracked clusters. Using this graph and a generic object model based on spatial connectedness and coherent motion, we label the tracked clusters as whole objects, fragments of objects or groups of interacting objects. The outputs of our algorithm are entire tracks of objects, which may include corresponding tracks from groups of objects during interactions. Experimental results on multiple video sequences are shown.
2006年4月25日 00:00:00 GMThttps://hdl.handle.net/1721.1/325362006年04月25日T00:00:00ZOf Malicious Motes and Suspicious Sensors
https://hdl.handle.net/1721.1/32534
Of Malicious Motes and Suspicious Sensors
Gilbert, Seth; Guerraoui, Rachid; Newport, Calvin
How much damage can a malicious tiny device cause in a single-hop wireless network? Imagine two players, Alice and Bob, who want to exchange information. Collin, a malicious adversary, wants to prevent them from communicating. By broadcasting at the same time as Alice or Bob, Collin can destroy their messages or overwhelm them with his own malicious data. Being a tiny device, however, Collin can only broadcast up to B times. Given that Alice and Bob do not know B, and cannot distinguish honest from malicious messages, how long can Collin prevent them from communicating? We show the answer to be 2B + Theta(lg|V|) communication rounds, where V is the set of values that Alice and Bob may transmit. We prove this result to be optimal by deriving an algorithm that matches our lower bound---even in the stronger case where Alice and Bob do not start the game at the same time. We then argue that this specific 3-player game captures the general extent to which a malicious adversary can disrupt coordination in a single-hop wireless network. We support this claim by deriving---via reduction from the 3-player game---round complexity lower bounds for several classical n-player problems: 2B + Theta(lg|V|) for reliable broadcast, 2B + Omega(lg(n/k)) for leader election among k contenders, and 2B + Omega(k*lg(|V|/k)) for static k-selection. We then consider an extension of our adversary model that also includes up to t crash failures. We study binary consensus as the archetypal problem for this environment and show a bound of 2B + Theta(t) rounds. We conclude by providing tight, or nearly tight, upper bounds for all four problems. The new upper and lower bounds in this paper represent the first such results for a wireless network in which the adversary has the ability to disrupt communication.
2006年4月19日 00:00:00 GMThttps://hdl.handle.net/1721.1/325342006年04月19日T00:00:00ZRevisiting Internet Addressing: Back to the Future!
https://hdl.handle.net/1721.1/32532
Revisiting Internet Addressing: Back to the Future!
Vutukuru, Mythili; Feamster, Nick; Walfish, Michael; Balakrishnan, Hari; Shenker, Scott
IP prefixes undermine three goals of Internet routing: accurate reflection of network-layer reachability, secure routing messages, and effective traffic control. This paper presents Atomic IP (AIP), a simple change to Internet addressing (which in fact reverts to how addressing once worked), that allows Internet routing to achieve these goals.
2006年4月14日 00:00:00 GMThttps://hdl.handle.net/1721.1/325322006年04月14日T00:00:00ZThe Symmetriad: A Journey of Discovery Through the Land of the Polychora
https://hdl.handle.net/1721.1/32531
The Symmetriad: A Journey of Discovery Through the Land of the Polychora
Radul, Alexey
I devised and implemented a method for constructing regular and semiregular geometric objects in n-dimensional Euclidean space. Given a finite reflection group (a Coxeter group) G, there is a standard way to give G a group action on n-space. Reflecting a point through this group action yields an object that exhibits the symmetries specified by G. If the point is chosen well, the object is guaranteed to be regular or semiregular, and many interesting regular and semiregular objects arise this way. By starting with the symmetry group, I can use the group structure both to simplify the actual graphics involved with displaying the object, and to illustrate various aspects of its structure. For example, subgroups of the symmetry group (and their cosets) correspond to substructures of the object. Conversely, by displaying such symmetric objects and their various substructures, I find that I can elucidate the structure of the symmetry group that gives rise to them. I have written The Symmetriad, the computer system whose name this document has inherited, and used it to explore 3- and 4-dimensional symmetric objects and their symmetry groups. The 3-dimensional objects are already well understood, but they serve to illustrate the techniques used on the 4-dimensional objects and make them more comprehensible. Four dimensions offers a treasure trove of intriguing structures, many of which have no ready 3D analogue. These are what I will show you here.
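The core construction, reflecting a seed point through the mirrors of a finite reflection group until the orbit closes, fits in a few lines. The Python sketch below uses the symmetry group of the square as a 2-D stand-in for the 3- and 4-dimensional groups the thesis explores; the mirrors and seed point are toy choices.

```python
import numpy as np

def reflect(point, normal):
    """Reflect a point through the hyperplane with the given unit normal."""
    return point - 2 * np.dot(point, normal) * normal

def orbit(point, normals, tol=1e-8):
    """Closure of a point under the group generated by the reflections.
    A well-chosen seed yields the vertices of a (semi)regular polytope."""
    pts = [np.asarray(point, float)]
    frontier = list(pts)
    while frontier:
        p = frontier.pop()
        for n in normals:
            q = reflect(p, n)
            if all(np.linalg.norm(q - r) > tol for r in pts):
                pts.append(q)
                frontier.append(q)
    return pts

# Two mirrors at 45 degrees generate the order-8 symmetry group of the square.
normals = [np.array([1.0, 0.0]),
           np.array([np.cos(np.pi / 4), np.sin(np.pi / 4)])]
print(len(orbit([1.0, 0.3], normals)))   # 8: a generic orbit
```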
MEng thesis
2005年1月01日 00:00:00 GMThttps://hdl.handle.net/1721.1/325312005年01月01日T00:00:00ZTask-Structured Probabilistic I/O Automata
https://hdl.handle.net/1721.1/32525
Task-Structured Probabilistic I/O Automata
Canetti, Ran; Cheung, Ling; Kaynar, Dilsun; Liskov, Moses; Lynch, Nancy; Pereira, Olivier; Segala, Roberto
In the Probabilistic I/O Automata (PIOA) framework, nondeterministic choices are resolved using perfect-information schedulers, which are similar to history-dependent policies for Markov decision processes (MDPs). These schedulers are too powerful in the setting of security analysis, leading to unrealistic adversarial behaviors. Therefore, we introduce in this paper a novel mechanism of task partitions for PIOAs. This allows us to define partial-information adversaries in a systematic manner, namely, via sequences of tasks. The resulting task-PIOA framework comes with simple notions of external behavior and implementation, and supports simple compositionality results. A new type of simulation relation is defined and proven sound with respect to our notion of implementation. To illustrate the potential of this framework, we summarize our verification of an Oblivious Transfer protocol, where we combine formal and computational analyses. Finally, we present an extension with extra expressive power, using local schedulers of individual components.
2006年3月31日 00:00:00 GMThttps://hdl.handle.net/1721.1/325252006年03月31日T00:00:00ZMaximum Entropy Correlated Equilibria
https://hdl.handle.net/1721.1/31339
Maximum Entropy Correlated Equilibria
Ortiz, Luis E.; Schapire, Robert E.; Kakade, Sham M.
We study maximum entropy correlated equilibria in (multi-player) games and provide two gradient-based algorithms that are guaranteed to converge to such equilibria. Although we do not provide convergence rates for these algorithms, they do have strong connections to other algorithms (such as iterative scaling) which are effective heuristics for tasks such as statistical estimation.
2006年3月20日 00:00:00 GMThttps://hdl.handle.net/1721.1/313392006年03月20日T00:00:00ZPyramid Match Kernels: Discriminative Classification with Sets of Image Features (version 2)
https://hdl.handle.net/1721.1/31338
Pyramid Match Kernels: Discriminative Classification with Sets of Image Features (version 2)
Grauman, Kristen; Darrell, Trevor
Discriminative learning is challenging when examples are sets of features, and the sets vary in cardinality and lack any sort of meaningful ordering. Kernel-based classification methods can learn complex decision boundaries, but a kernel over unordered set inputs must somehow solve for correspondences -- generally a computationally expensive task that becomes impractical for large set sizes. We present a new fast kernel function which maps unordered feature sets to multi-resolution histograms and computes a weighted histogram intersection in this space. This "pyramid match" computation is linear in the number of features, and it implicitly finds correspondences based on the finest resolution histogram cell where a matched pair first appears. Since the kernel does not penalize the presence of extra features, it is robust to clutter. We show the kernel function is positive-definite, making it valid for use in learning algorithms whose optimal solutions are guaranteed only for Mercer kernels. We demonstrate our algorithm on object recognition tasks and show it to be accurate and dramatically faster than current approaches. (This tech report updates MIT-CSAIL-TR-2005-017 and the paper "The Pyramid Match Kernel: Discriminative Classification with Sets of Image Features" which appeared in the proceedings of ICCV 2005.)
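A minimal Python sketch of the pyramid match for 1-D features on a known range: intersect histograms from fine to coarse, and weight the matches that are new at each level by the inverse of the bin width. The level count, range, and feature sets are toy assumptions for illustration.

```python
import numpy as np

def pyramid_match(x, y, levels=4, lo=0.0, hi=1.0):
    """Weighted multi-resolution histogram intersection of two point sets."""
    score, prev = 0.0, 0.0
    for i in range(levels):
        bins = 2 ** (levels - i)                 # finest level first
        hx, _ = np.histogram(x, bins=bins, range=(lo, hi))
        hy, _ = np.histogram(y, bins=bins, range=(lo, hi))
        inter = np.minimum(hx, hy).sum()         # implicit matches at this level
        score += (inter - prev) / 2 ** i         # count only newly formed matches
        prev = inter
    return score

rng = np.random.default_rng(0)
x, y = rng.random(20), rng.random(25)            # feature sets of unequal size
print(pyramid_match(x, y))
```

Each nested halving of resolution can only merge bins, so the intersection count never decreases; the difference at each level is exactly the number of matches first formed there, which is what the 1/2^i weights discount.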
2006年3月18日 00:00:00 GMThttps://hdl.handle.net/1721.1/313382006年03月18日T00:00:00ZComputing action equivalences for planning under time-constraints
https://hdl.handle.net/1721.1/31337
Computing action equivalences for planning under time-constraints
Gardiol, Natalia H.; Kaelbling, Leslie Pack
In order for autonomous artificial decision-makers to solve realistic tasks, they need to deal with the dual problems of searching through large state and action spaces under time pressure. We study the problem of planning in domains with many objects. Structured representations of action can help provide guidance when the number of action choices and the size of the state space are large. We show how structured representations of action effects can help us partition the action space into a smaller set of approximate equivalence classes. Then, the pared-down action space can be used to identify a useful subset of the state space in which to search for a solution. As computational resources permit, we then allow ourselves to elaborate the original solution. This kind of analysis allows us to collapse the action space and permits faster planning in much larger domains than before.
2006年3月20日 00:00:00 GMThttps://hdl.handle.net/1721.1/313372006年03月20日T00:00:00ZDNA Binding and Games
https://hdl.handle.net/1721.1/31311
DNA Binding and Games
Perez-Breva, Luis; Ortiz, Luis E.; Yeang, Chen-Hsiang, 1969-; Jaakkola, Tommi
We propose a game-theoretic approach to learn and predict coordinate binding of multiple DNA binding regulators. The framework implements resource-constrained allocation of proteins to local neighborhoods as well as to sites themselves, and explicates coordinate and competitive binding relations among proteins with affinity to the site or region. The focus of this paper is on the mathematical foundations of the new modeling approach. We demonstrate the approach in the context of the lambda-phage switch, a well-known biological subsystem, and provide simulation results that successfully illustrate the predictions that can be derived from the model with known structure and affinities. Subsequent work will elaborate on methods for learning the affinities and game structures from available binding data.
2006年3月06日 00:00:00 GMThttps://hdl.handle.net/1721.1/313112006年03月06日T00:00:00ZUsing Task-Structured Probabilistic I/O Automata to Analyze an Oblivious Transfer Protocol
https://hdl.handle.net/1721.1/31310
Using Task-Structured Probabilistic I/O Automata to Analyze an Oblivious Transfer Protocol
Canetti, Ran; Cheung, Ling; Kaynar, Dilsun; Liskov, Moses; Lynch, Nancy; Pereira, Olivier; Segala, Roberto
The Probabilistic I/O Automata framework of Lynch, Segala and Vaandrager provides tools for precisely specifying protocols and reasoning about their correctness using multiple levels of abstraction, based on implementation relationships between these levels. We enhance this framework to allow analyzing protocols that use cryptographic primitives. This requires resolving and reconciling issues such as nondeterministic behavior and scheduling, randomness, resource-bounded computation, and computational hardness assumptions. The enhanced framework allows for more rigorous and systematic analysis of cryptographic protocols. To demonstrate the use of this framework, we present an example analysis that we have done for an Oblivious Transfer protocol.
2006年3月08日 00:00:00 GMThttps://hdl.handle.net/1721.1/313102006年03月08日T00:00:00ZHyperglue: Designing High-Level Agent Communication for Distributed Applications
https://hdl.handle.net/1721.1/31223
Hyperglue: Designing High-Level Agent Communication for Distributed Applications
Peters, Stephen; Look, Gary; Quigley, Kevin; Shrobe, Howard; Gajos, Krzysztof
We are building a new communication model and discovery system which will allow agent-based intelligent spaces to interact with one another. This new infrastructure layer, called Hyperglue, coordinates agent actions at a higher level than most agent communication does, providing an interface for communication at the level of "real-world" entities such as people, places, organizations, and information sources. The resulting structure is one which allows these agent communities to interact, while preserving the privacy, privileges, and preferences of the entities they represent. In this paper we describe the rationale for Hyperglue, and present the initial design as an extension of the existing Metaglue agent framework developed at the MIT AI Lab.
2006年3月01日 00:00:00 GMThttps://hdl.handle.net/1721.1/312232006年03月01日T00:00:00ZPlan-Driven Pervasive Computing
https://hdl.handle.net/1721.1/31222
Plan-Driven Pervasive Computing
Look, Gary; Peters, Stephen; Shrobe, Howard
The goal of human-centered, pervasive computing should be to hide the details of the computing environment, allowing users to concentrate on their goals, rather than on the direct management of devices. This paper describes a system that operates at the level of goals and plans, rather than individual resources. It adaptively selects from its plan library that plan which is likely to best achieve the user's goal in view of his preferences and current resource availability. Once the plan and resources are selected, it monitors the execution of the plan, dispatching subtasks when they are ready to be executed.
2006年3月01日 00:00:00 GMThttps://hdl.handle.net/1721.1/312222006年03月01日T00:00:00ZAmorphous Infrastructure for Language Implementation
https://hdl.handle.net/1721.1/31221
Amorphous Infrastructure for Language Implementation
Newton, Ryan; Beal, Jacob
We propose a method for the robust implementation of simple graphical automata on an amorphous computer. This infrastructure is applied to the implementation of purely functional programming languages. Specifically, it is used in conjunction with data-flow techniques to implement a toy language homologous to recurrence equations, exploiting control-flow parallelism through parallel operand evaluation. Also, data parallelism is explored in a separate implementation, in which a simple mark-up syntax enables Scheme programs to perform spatially-distributed tree-walking without modifying their semantics. This addition enables an idiomatically expressed interpreter to be trivially instrumented, producing a spatially distributed universal machine, and once again achieving control flow parallelism in the interpreted language.
2002年12月10日 00:00:00 GMThttps://hdl.handle.net/1721.1/312212002年12月10日T00:00:00ZA soft touch: Compliant Tactile Sensors for Sensitive Manipulation
https://hdl.handle.net/1721.1/31220
A soft touch: Compliant Tactile Sensors for Sensitive Manipulation
Torres-Jara, Eduardo; Vasilescu, Iuliu; Coral, Raul
We present the design, analysis and construction of a biologically inspired tactile sensor. The sensor can measure normal and lateral forces, conform to the surfaces with which it comes in contact, and increase the friction of the surface for a good grasp. The sensor is built using a simple process and the applied forces are read using standard electronics. These features make the sensors ideal for mass production. We are motivated to build tactile sensors that are useful for robotic manipulation, given that current ones do not have the features that we consider necessary. The sensors presented in this paper have been designed to deal with these issues. They have been designed and implemented in the fingers of the humanoid robot Obrero.
2006年3月01日 00:00:00 GMThttps://hdl.handle.net/1721.1/312202006年03月01日T00:00:00ZFinite Horizon Control Design for Optimal Discrimination between Several Models
https://hdl.handle.net/1721.1/31219
Finite Horizon Control Design for Optimal Discrimination between Several Models
Blackmore, Lars; Williams, Brian
Multiple-model fault detection is a powerful method for detecting changes, such as faults, in dynamic systems. In many cases, the ability of such a detection scheme to distinguish between possible models for the system dynamics depends critically on the control inputs applied to the system. Prior work has therefore aimed to design control inputs in order to improve fault detection. We previously developed a new method that uses constrained finite horizon control design to create control inputs that minimize an upper bound on the probability of model selection error. This method is limited, however, to the problem of selection between two models. In this paper we describe a new method that extends this approach to handle an arbitrary number of models. By optimizing subject to hard constraints, the new method can ensure that a defined task is fulfilled while optimally discriminating between models. This means that the discrimination power of the designed control input can be much greater than that created by other approaches, which typically design "auxiliary signals" with limited power so that the effect on the system state is small. Furthermore, the optimization criterion, which is an upper bound on the probability of model selection error, has a more meaningful interpretation than alternative approaches based on, for example, information gain. We demonstrate the method using an aircraft fault detection scenario and show that the new method significantly reduces the bound on the probability of error when compared to a manually generated identification sequence and a fuel-optimal sequence.
2006年2月28日 00:00:00 GMThttps://hdl.handle.net/1721.1/312192006年02月28日T00:00:00ZInteractive Animation of Dynamic Manipulation
https://hdl.handle.net/1721.1/31218
Interactive Animation of Dynamic Manipulation
Abe, Yeuhi; Popovic, Jovan
Lifelike animation of manipulation must account for the dynamic interaction between animated characters, objects, and their environment. Failing to do so would ignore the often significant effects objects have on the motion of the character. For example, lifting a heavy object would appear identical to lifting a light one. Physical simulation handles such interaction correctly, with a principled approach that adapts easily to different circumstances, changing environments, and unexpected disturbances. Our work shows how to control lifelike animated characters so that they accomplish manipulation tasks within an interactive physical simulation. Our new multi-task control algorithm simplifies descriptions of manipulation by supporting prioritized goals in both the joint space of the character and the task space of the object. The end result is a versatile algorithm that incorporates realistic force limits and recorded motion postures to portray lifelike manipulation automatically.
2006年2月28日 00:00:00 GMThttps://hdl.handle.net/1721.1/312182006年02月28日T00:00:00ZControl and Estimation for Cooperative Manipulator Tasks
https://hdl.handle.net/1721.1/31217
Control and Estimation for Cooperative Manipulator Tasks
Blackmore, Lars; Block, Steve
The objective of this project is to achieve reliable transfer of an object from one robotic manipulator to another. This capability is useful for a number of applications, for instance robotic assembly, or robots with multiple manipulators, such as humanoid robots. Achieving reliable object transfer poses a number of challenges for both control and estimation. As with most manipulation problems, the inverse kinematics problem must be solved so that the desired endpoint location can be specified in Cartesian coordinates, rather than in the joint space of the manipulator. An additional challenge particular to the cooperative robotics problem is that more than one manipulator may have a grasp on the same object. Manipulators that are carrying out simple position control may encounter problems when grasping the same object. Minor errors in forward kinematics can lead to large controller forces, or even unstable dynamics, as each controller tries to counteract the other to drive the perceived error to zero. On the estimation side, carrying out reliable transfer depends critically on determining the grasp state; in other words, does a particular robot have a grasp on the object, or do both have the object? The grasp state must be determined before the sequence of events in a transfer task can proceed. For example, the manipulator receiving the object cannot move away until it is certain that the manipulator passing the object has released it. In many instances, having pressure sensors mounted in the hand is infeasible. For example, packaging constraints can mean that the necessary space is not available, as is the case with the JPL LEMUR hexapod. We therefore need to infer the grasp state from the available observations, which are usually supplied by position encoders at the joints. For this project we assume that each manipulator carries out estimation independently, without joint angle observations from the other robot, but with knowledge of its own joint angles and of the commands to be issued to both robots. This is typical of a multi-agent cooperative task, and the lack of observations makes the estimation task even more challenging. This report describes the approach we use to solve this problem, which comprises an impedance controller and a hybrid estimator.
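On the control side, the essential difference from pure position control is easy to state in code. The Python sketch below shows a generic endpoint impedance law, F = K(x_d - x) - Bv: rather than driving position error to zero at any cost, each arm renders a virtual spring-damper, so two arms holding the same object exert bounded, balanced forces even when their goals disagree slightly. The gains and poses are illustrative, not from the report.

```python
import numpy as np

def impedance_force(x, v, x_des, stiffness=50.0, damping=10.0):
    """Virtual spring-damper at the endpoint: F = K (x_des - x) - B v."""
    return stiffness * (x_des - x) - damping * v

# Two toy end-effectors rigidly grasping the same point, commanded to
# slightly inconsistent goals: the forces stay small and balanced.
x = np.array([0.50, 0.00])                 # shared grasp point
v = np.zeros(2)
f1 = impedance_force(x, v, np.array([0.52, 0.00]))
f2 = impedance_force(x, v, np.array([0.48, 0.00]))
print(f1, f2, f1 + f2)                     # equal and opposite here
```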
2006年2月28日 00:00:00 GMThttps://hdl.handle.net/1721.1/312172006年02月28日T00:00:00ZEncrypted Keyword Search in a Distributed Storage System
https://hdl.handle.net/1721.1/31216
Encrypted Keyword Search in a Distributed Storage System
Artzi, Shay; Kiezun, Adam; Newport, Calvin; Schultz, David
Encrypted keyword search allows a server to perform a search over a set of encrypted documents on behalf of a client without learning the contents of the documents or the words being searched for. Designing a practical system is challenging because the privacy constraint thwarts standard indexing and ranking techniques. We present Mafdet, an encrypted keyword search system we have implemented. Our system makes the search practical even for large data sets. We evaluated Mafdet's performance on a set of queries and a large collection of documents. In these queries, Mafdet's accuracy is within 6% of Google Desktop, and the search time is on the order of seconds for document sets as large as 2.6 GB.
2006年2月23日 00:00:00 GMThttps://hdl.handle.net/1721.1/312162006年02月23日T00:00:00ZNetwork Coding Made Practical
https://hdl.handle.net/1721.1/31212
Network Coding Made Practical
Katti, Sachin; Rahul, Hariharan; Hu, Wenjun; Katabi, Dina; Crowcroft, Jon
We propose a new architecture for wireless mesh networks. In addition to forwarding packets, routers mix (i.e., code) packets from different sources to increase the information content of each transmission. We show that intelligently mixing packets increases network throughput. Our design is rooted in the theory of network coding. In contrast to prior work on network coding, which is mainly theoretical and focuses on multicast traffic, ours is practical and solves the common case of unicast traffic. We present the first implementation of network coding in a wireless network. Our system introduces a coding layer between the IP and MAC layers. It works with UDP and TCP traffic, and hence seamlessly integrates with existing applications. We evaluate our design on a 34-node wireless testbed and show that it delivers a 3-4x increase in the throughput of wireless mesh networks.
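To make the XOR-mixing idea concrete, here is a minimal sketch (illustrative only; the actual system sits between the IP and MAC layers and handles scheduling, packet matching, and many concurrent flows):

    # Two endpoints exchange packets through a shared wireless router.
    def xor_bytes(a: bytes, b: bytes) -> bytes:
        """XOR two equal-length byte strings."""
        return bytes(x ^ y for x, y in zip(a, b))

    p1 = b"hello bob "   # Alice's packet, destined for Bob
    p2 = b"hi alice!!"   # Bob's packet, destined for Alice (same length)

    # Instead of two unicast transmissions, the router broadcasts one
    # coded packet; each receiver decodes with the packet it already has.
    coded = xor_bytes(p1, p2)
    assert xor_bytes(coded, p2) == p1   # Bob knows p2, recovers p1
    assert xor_bytes(coded, p1) == p2   # Alice knows p1, recovers p2

One broadcast replaces two transmissions, which is the source of the throughput gain the abstract reports.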
2006年2月16日 00:00:00 GMThttps://hdl.handle.net/1721.1/312122006年02月16日T00:00:00ZLearning Semantic Scene Models by Trajectory Analysis
https://hdl.handle.net/1721.1/31208
Learning Semantic Scene Models by Trajectory Analysis
Wang, Xiaogang; Tieu, Kinh; Grimson, Eric
In this paper, we describe an unsupervised learning framework to segment a scene into semantic regions and to build semantic scene models from long-term observations of moving objects in the scene. First, we introduce two novel similarity measures for comparing trajectories in far-field visual surveillance. The measures simultaneously compare the spatial distribution of trajectories and other attributes, such as velocity and object size, along the trajectories. They also provide a comparison confidence measure which indicates how well the measured image-based similarity approximates true physical similarity. We also introduce novel clustering algorithms which use both similarity and comparison confidence. Based on the proposed similarity measures and clustering methods, a framework to learn semantic scene models by trajectory analysis is developed. Trajectories are first clustered into vehicles and pedestrians, and then further grouped based on spatial and velocity distributions. Different trajectory clusters represent different activities. The geometric and statistical models of structures in the scene, such as roads, walk paths, sources and sinks, are automatically learned from the trajectory clusters. Abnormal activities are detected using the semantic scene models. The system is robust to low-level tracking errors.
2006年2月10日 00:00:00 GMThttps://hdl.handle.net/1721.1/312082006年02月10日T00:00:00ZTransparent Accountable Data Mining: New Strategies for Privacy Protection
https://hdl.handle.net/1721.1/30972
Transparent Accountable Data Mining: New Strategies for Privacy Protection
Weitzner, Daniel J.; Abelson, Harold; Berners-Lee, Tim; Hanson, Chris; Hendler, James; Kagal, Lalana; McGuinness, Deborah L.; Sussman, Gerald Jay; Waterman, K. Krasnow
Attempts to address issues of personal privacy in a world of computerized databases and information networks -- from security technology to data protection regulation to Fourth Amendment law jurisprudence -- typically proceed from the perspective of controlling or preventing access to information. We argue that this perspective has become inadequate and obsolete, overtaken by the ease of sharing and copying data and of aggregating and searching across multiple databases, to reveal private information from public sources. To replace this obsolete framework, we propose that issues of privacy protection currently viewed in terms of data access be re-conceptualized in terms of data use. From a technology perspective, this requires supplementing legal and technical mechanisms for access control with new mechanisms for transparency and accountability of data use. In this paper, we present a technology infrastructure -- the Policy Aware Web -- that supports transparent and accountable data use on the World Wide Web, and elements of a new legal and regulatory regime that supports privacy through provable accountability to usage rules rather than merely data access restrictions.
2006年1月27日 00:00:00 GMThttps://hdl.handle.net/1721.1/309722006年01月27日T00:00:00ZA Consistency Management Layer for Inter-Domain Routing
https://hdl.handle.net/1721.1/30971
A Consistency Management Layer for Inter-Domain Routing
Kushman, Nate; Katabi, Dina; Wroclawski, John
This paper proposes an isolation layer -- a shim -- between inter-domain routing and packet forwarding. The job of this layer is to coordinate between Autonomous Systems (AS's) on when and how to modify the forwarding state to ensure inter-domain routing loops do not cause forwarding loops. The benefits of a consistency layer are twofold. First, it prevents the creation of transient inter-domain forwarding loops and the resulting packet loss, high latency, and connection failures. Second, by taking the burden of forwarding consistency off the inter-domain routing protocol, it enables inter-domain routing protocols with more complex convergence characteristics than BGP, such as protocols that optimize route selection based on performance. We offer two possible designs for the consistency layer. We prove that both designs are free of forwarding loops and show they are easy to deploy in the current Internet.
2006年1月27日 00:00:00 GMThttps://hdl.handle.net/1721.1/309712006年01月27日T00:00:00ZA Unified Information Theoretic Framework for Pair- and Group-wise Registration of Medical Images
https://hdl.handle.net/1721.1/30970
A Unified Information Theoretic Framework for Pair- and Group-wise Registration of Medical Images
Zollei, Lilla
The field of medical image analysis has been rapidly growing for the past two decades. Besides a significant growth in computational power, scanner performance, and storage facilities, this acceleration is partially due to an unprecedented increase in the amount of data sets accessible to researchers. Medical experts traditionally rely on manual comparisons of images, but the abundance of information now available makes this task increasingly difficult. Such a challenge calls for more automation in processing the images. In order to carry out any sort of comparison among multiple medical images, one frequently needs to identify the proper correspondence between them. This step allows us to follow the changes that happen to anatomy throughout a time interval, to identify differences between individuals, or to acquire complementary information from different data modalities. Registration achieves such a correspondence. In this dissertation we focus on the unified analysis and characterization of statistical registration approaches. We formulate and interpret a select group of pair-wise registration methods in the context of a unified statistical and information theoretic framework. This clarifies the implicit assumptions of each method and yields a better understanding of their relative strengths and weaknesses. This guides us to a new registration algorithm that incorporates the advantages of the previously described methods. Next we extend the unified formulation with analysis of group-wise registration algorithms that align a population as opposed to pairs of data sets. Finally, we present our group-wise registration framework, stochastic congealing. The algorithm runs in a simultaneous fashion, with every member of the population approaching the central tendency of the collection at the same time. It eliminates the need for selecting a particular reference frame a priori, resulting in a non-biased estimate of a digital template. Our algorithm adopts an information theoretic objective function which is optimized via a gradient-based stochastic approximation process embedded in a multi-resolution setting. We demonstrate the accuracy and performance characteristics of stochastic congealing via experiments on both synthetic and real images.
PhD thesis
2006年1月25日 00:00:00 GMThttps://hdl.handle.net/1721.1/309702006年01月25日T00:00:00ZService Identification in TCP/IP: Well-Known versus Random Port Numbers
https://hdl.handle.net/1721.1/30606
Service Identification in TCP/IP: Well-Known versus Random Port Numbers
Masiello, Elizabeth
The sixteen-bit well-known port number is often overlooked as a network identifier in Internet communications. Its purpose at the most fundamental level is only to demultiplex flows of traffic. Several unintended uses of the port number evolved from associating services with a list of well-known port numbers. This thesis documents those unintended consequences in an effort to describe the port number's influence on Internet players from ISPs to application developers to individual users. Proposals and examples of moving away from well-known port numbers to randomly assigned ones are then presented, with analysis of impacts on the political and economic systems on which Internet communication is dependent.
SM thesis
2006年1月11日 00:00:00 GMThttps://hdl.handle.net/1721.1/306062006年01月11日T00:00:00ZWide-Area Egomotion Estimation from Known 3D Structure
https://hdl.handle.net/1721.1/30605
Wide-Area Egomotion Estimation from Known 3D Structure
Koch, Olivier; Teller, Seth
We describe an algorithm that takes as inputs a coarse 3D model of an environment, and a video sequence acquired within the environment, and produces as output an estimate of the camera's 6-DOF egomotion expressed in the coordinates of the 3D model. Our method has several novel aspects: it performs line-based structure-from-motion; it aligns the local line constellation to the known model; and it uses off-line visibility analysis to dramatically accelerate the alignment process. We present simulation results demonstrating the method's operation in a multi-room environment. We show that the method can estimate metric egomotion accurately and could be used for many minutes of operation and thousands of video frames.
2006年1月09日 00:00:00 GMThttps://hdl.handle.net/1721.1/306052006年01月09日T00:00:00ZNuggeteer: Automatic Nugget-Based Evaluation Using Descriptions and Judgements
https://hdl.handle.net/1721.1/30604
Nuggeteer: Automatic Nugget-Based Evaluation Using Descriptions and Judgements
Marton, Gregory
TREC Definition and Relationship questions are evaluated on the basis of information nuggets that may be contained in system responses. Human evaluators provide informal descriptions of each nugget, and judgements (assignments of nuggets to responses) for each response submitted by participants. The best present automatic evaluation for these kinds of questions is Pourpre. Pourpre uses a stemmed unigram similarity of responses with nugget descriptions, yielding an aggregate result that is difficult to interpret, but is useful for relative comparison. Nuggeteer, by contrast, uses both the human descriptions and the human judgements, and makes binary decisions about each response, so that the end result is as interpretable as the official score. I explore n-gram length, use of judgements, stemming, and term weighting, and provide a new algorithm quantitatively comparable to, and qualitatively better than, the state of the art.
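As a rough illustration of the binary-decision idea, the following sketch scores a response against a nugget description by n-gram overlap (the fixed threshold, the lack of stemming, and the plain overlap measure are simplifications chosen here; the thesis explores these dimensions and additionally exploits the human judgements):

    from typing import Set, Tuple

    def ngrams(text: str, n: int = 1) -> Set[Tuple[str, ...]]:
        toks = text.lower().split()
        return {tuple(toks[i:i + n]) for i in range(len(toks) - n + 1)}

    def contains_nugget(response: str, nugget_desc: str,
                        n: int = 1, threshold: float = 0.5) -> bool:
        """Binary decision: does the response contain the nugget?"""
        desc = ngrams(nugget_desc, n)
        hit = len(desc & ngrams(response, n)) / max(len(desc), 1)
        return hit >= threshold

    print(contains_nugget("the drug was approved by the FDA in 1998",
                          "FDA approved the drug"))   # True

Binary per-response decisions are what make the final score directly interpretable, in contrast to an aggregate similarity value.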
2006年1月09日 00:00:00 GMThttps://hdl.handle.net/1721.1/306042006年01月09日T00:00:00ZPolylogarithmic Approximation Algorithm for Non-Uniform Multicommodity Buy-at-Bulk
https://hdl.handle.net/1721.1/30602
Polylogarithmic Approximation Algorithm for Non-Uniform Multicommodity Buy-at-Bulk
Hajiaghayi, MohammadTaghi; Kortsarz, Guy; Salavatipour, Mohammad R.
We consider the non-uniform multicommodity buy-at-bulk network design problem. In this problem we are given a graph $G(V,E)$ with two cost functions on the edges, a buy cost $b:E\longrightarrow \RR^+$ and a rent cost $r:E\longrightarrow\RR^+,ドル and a set of source-sink pairs $s_i,t_i\in V$ (1ドル\leq i\leq \alpha$), with each pair $i$ having a positive demand $\delta_i$. Our goal is to design a minimum cost network $G(V,E')$ such that for every 1ドル\leq i\leq\alpha,ドル $s_i$ and $t_i$ are in the same connected component in $G(V,E')$. The total cost of $G(V,E')$ is the sum of buy costs of the edges in $E',ドル plus, for every edge in $E',ドル the total demand going through that edge times the rent cost of that edge. Since the costs of different edges can be different, we say that the problem is non-uniform. The first non-trivial approximation algorithm for this problem is due to Charikar and Karagiozova (STOC '05), whose algorithm has an approximation guarantee of $\exp(O(\sqrt{\log n\log\log n}))$ when all $\delta_i=1,ドル and $\exp(O(\sqrt{\log N\log\log N}))$ for the general demand case, where $N$ is the sum of all demands. We improve upon this result by presenting the first polylogarithmic (specifically, $O(\log^4 n)$ for unit demands and $O(\log^4 N)$ for general demands) approximation for this problem. The algorithm relies on a recent result \cite{HKS1} for the buy-at-bulk $k$-Steiner tree problem.
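Written as a single display (with $f(e)$ denoting the total demand routed across edge $e,ドル a symbol introduced here only for readability), the objective described above is

\[ \mathrm{cost}(E') \;=\; \sum_{e \in E'} b(e) \;+\; \sum_{e \in E'} f(e)\, r(e). \]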
2005年11月26日 00:00:00 GMThttps://hdl.handle.net/1721.1/306022005年11月26日T00:00:00ZApproximating Buy-at-Bulk k-Steiner trees
https://hdl.handle.net/1721.1/30601
Approximating Buy-at-Bulk k-Steiner trees
Hajiaghayi, MohammadTaghi; Kortsarz, Guy; Salavatipour, Mohammad R.
In the buy-at-bulk $k$-Steiner tree (or rent-or-buy $k$-Steiner tree) problem we are given a graph $G(V,E)$ with a set of terminals $T\subseteq V,ドル including a particular vertex $s$ called the root, and an integer $k\leq |T|$. There are two cost functions on the edges of $G,ドル a buy cost $b:E\longrightarrow \RR^+$ and a rent cost $r:E\longrightarrow \RR^+$. The goal is to find a subtree $H$ of $G$ rooted at $s$ with at least $k$ terminals so that the cost $\sum_{e\in H} b(e)+\sum_{t\in T-s} dist(t,s)$ is minimized, where $dist(t,s)$ is the distance from $t$ to $s$ in $H$ with respect to the $r$ cost. Our main result is an $O(\log^5 n)$-approximation for the buy-at-bulk $k$-Steiner tree problem. To achieve this we also design an approximation algorithm for the bicriteria $k$-Steiner tree problem, in which we are given a graph $G$ with edge costs $b(e)$ and distance costs $r(e)$ over the edges, and an integer $k$. Our goal is to find a minimum cost (under $b$-cost) $k$-Steiner tree such that the diameter under $r$-cost is at most some given bound $D$. An $(\alpha,\beta)$-approximation finds a subgraph of diameter at most $\alpha\cdot D$ (with respect to $r$) and cost with respect to $b$ of at most $\beta\cdot opt,ドル where $opt$ is the minimum cost of any solution with diameter at most $D$. Marathe et al.\ \cite{ravi} gave an $(O(\log n),O(\log n))$-approximation algorithm for the bicriteria Steiner tree problem. Their algorithm does not extend to the bicriteria $k$-Steiner tree problem. Our algorithm for the buy-at-bulk $k$-Steiner tree problem relies on an $(O(\log^2 n),O(\log^4 n))$-approximation algorithm we develop for the (shallow-light) bicriteria $k$-Steiner tree problem, which is of independent interest. Indeed, this is also one of the main tools we use to obtain the first polylogarithmic approximation algorithm for non-uniform multicommodity buy-at-bulk~\cite{HKS}.
2005年11月15日 00:00:00 GMThttps://hdl.handle.net/1721.1/306012005年11月15日T00:00:00ZCognitive-Developmental Learning for a Humanoid Robot: A Caregiver's Gift
https://hdl.handle.net/1721.1/30591
Cognitive-Developmental Learning for a Humanoid Robot: A Caregiver's Gift
Arsenio, Artur Miguel
The goal of this work is to build a cognitive system for the humanoid robot, Cog, that exploits human caregivers as catalysts to perceive and learn about actions, objects, scenes, people, and the robot itself. This thesis addresses a broad spectrum of machine learning problems across several categorization levels. Actions by embodied agents are used to automatically generate training data for the learning mechanisms, so that the robot develops categorization autonomously. Taking inspiration from the human brain, a framework of algorithms and methodologies was implemented to emulate different cognitive capabilities on the humanoid robot Cog. This framework is effectively applied to a collection of AI, computer vision, and signal processing problems. Cognitive capabilities of the humanoid robot are developmentally created, starting from infant-like abilities for detecting, segmenting, and recognizing percepts over multiple sensing modalities. Human caregivers provide a helping hand for communicating such information to the robot. This is done by actions that create meaningful events (by changing the world in which the robot is situated), thus inducing the "compliant perception" of objects from these human-robot interactions. Self-exploration of the world extends the robot's knowledge concerning object properties. This thesis argues for enculturating humanoid robots using infant development as a metaphor for building a humanoid robot's cognitive abilities. A human caregiver redesigns a humanoid's brain by teaching the humanoid robot as she would teach a child, using children's learning aids such as books, drawing boards, or other cognitive artifacts. Multi-modal object properties are learned using these tools and inserted into several recognition schemes, which are then applied to developmentally acquire new object representations. The humanoid robot therefore sees the world through the caregiver's eyes. Building an artificial humanoid robot's brain, even at an infant's cognitive level, has been a long quest which still lies only in the realm of our imagination. Our efforts towards such a dimly imaginable task are developed according to two alternate and complementary views: cognitive and developmental.
2004年9月26日 00:00:00 GMThttps://hdl.handle.net/1721.1/305912004年09月26日T00:00:00ZElectronic Cash with Blind Deposits: How to Have No Spare Change
https://hdl.handle.net/1721.1/30427
Electronic Cash with Blind Deposits: How to Have No Spare Change
Liskov, Moses
Electronic cash schemes in which the bank authenticates many coins at once suffer from the problem that coins that are authenticated together can be linked to one another. Unfortunately, unless a user spends coins in a closely prescribed manner, different batches of coins ("wallets") will be linked together in these schemes. This is illustrated by the problem of what a customer does with the "spare change" - an unusable small amount of money left in a wallet. We propose a new protocol to be used in e-cash schemes: blind deposits. In a blind deposit, a customer returns a coin to the bank without revealing the coin. We present a secure and efficient e-cash scheme with this added feature based on that of Liskov-Micali [LM01].
2003年10月14日 00:00:00 GMThttps://hdl.handle.net/1721.1/304272003年10月14日T00:00:00ZGenerating Trees of (Reducible) 1324-avoiding Permutations
https://hdl.handle.net/1721.1/30426
Generating Trees of (Reducible) 1324-avoiding Permutations
Marinov, Darko; Radoicic, Rados
We consider permutations that avoid the pattern 1324. We give exact formulas for the number of reducible 1324-avoiding permutations and the number of {1324, 4132, 2413, 3241}-avoiding permutations. By studying the generating tree for all 1324-avoiding permutations, we obtain a recurrence formula for their number. A computer program provides data for the number of 1324-avoiding permutations of length up to 20.
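The small-length counts are easy to verify by brute force (a sketch for checking purposes only; the paper's generating-tree recurrence is what reaches length 20):

    from itertools import permutations, combinations

    def contains_1324(p) -> bool:
        """True iff p has indices i<j<k<l with p[i] < p[k] < p[j] < p[l]."""
        return any(p[i] < p[k] < p[j] < p[l]
                   for i, j, k, l in combinations(range(len(p)), 4))

    for n in range(1, 8):
        count = sum(1 for p in permutations(range(n))
                    if not contains_1324(p))
        print(n, count)   # n = 4 gives 23: only 1324 itself contains the pattern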
2003年10月09日 00:00:00 GMThttps://hdl.handle.net/1721.1/304262003年10月09日T00:00:00ZError weighted classifier combination for multi-modal human identification
https://hdl.handle.net/1721.1/30590
Error weighted classifier combination for multi-modal human identification
Ivanov, Yuri; Serre, Thomas; Bouvrie, Jacob
In this paper we describe a technique of classifier combination used in a human identification system. The system integrates all available features from multi-modal sources within a Bayesian framework. The framework allows representing a class of popular classifier combination rules and methods within a single formalism. It relies on a per-class measure of confidence, derived from the performance of each classifier on training data, that is shown to improve performance on a synthetic data set. The method is especially relevant in an autonomous surveillance setting where varying time scales and missing features are a common occurrence. We show an application of this technique to a real-world surveillance database of video and audio recordings of people collected over several weeks in an office setting.
2005年12月14日 00:00:00 GMThttps://hdl.handle.net/1721.1/305902005年12月14日T00:00:00ZAutomatic Software Upgrades for Distributed Systems
https://hdl.handle.net/1721.1/30589
Automatic Software Upgrades for Distributed Systems
Ajmani, Sameer
Upgrading the software of long-lived, highly-available distributed systems is difficult. It is not possible to upgrade all the nodes in a system at once, since some nodes may be unavailable and halting the system for an upgrade is unacceptable. Instead, upgrades may happen gradually, and there may be long periods of time when different nodes are running different software versions and need to communicate using incompatible protocols. We present a methodology and infrastructure that address these challenges and make it possible to upgrade distributed systems automatically while limiting service disruption. Our methodology defines how to enable nodes to interoperate across versions, how to preserve the state of a system across upgrades, and how to schedule an upgrade so as to limit service disruption. The approach is modular: defining an upgrade requires understanding only the new software and the version it replaces. The upgrade infrastructure is a generic platform for distributing and installing software while enabling nodes to interoperate across versions. The infrastructure requires no access to the system source code and is transparent: node software is unaware that different versions even exist. We have implemented a prototype of the infrastructure called Upstart that intercepts socket communication using a dynamically-linked C++ library. Experiments show that Upstart has low overhead and works well for both local-area and Internet systems.
2005年11月30日 00:00:00 GMThttps://hdl.handle.net/1721.1/305892005年11月30日T00:00:00ZConditional Random People: Tracking Humans with CRFs and Grid Filters
https://hdl.handle.net/1721.1/30588
Conditional Random People: Tracking Humans with CRFs and Grid Filters
Taycher, Leonid; Shakhnarovich, Gregory; Demirdjian, David; Darrell, Trevor
We describe a state-space tracking approach based on a Conditional Random Field (CRF) model, where the observation potentials are \emph{learned} from data. We find functions that embed both state and observation into a space where similarity corresponds to $L_1$ distance, and define an observation potential based on distance in this space. This potential is extremely fast to compute, and in conjunction with a grid-filtering framework can be used to reduce a continuous state estimation problem to a discrete one. We show how a state temporal prior in the grid-filter can be computed in a manner similar to a sparse HMM, resulting in real-time system performance. The resulting system is used for human pose tracking in video sequences.
2005年12月01日 00:00:00 GMThttps://hdl.handle.net/1721.1/305882005年12月01日T00:00:00ZIdentifying Expression Fingerprints using Linguistic Information
https://hdl.handle.net/1721.1/30587
Identifying Expression Fingerprints using Linguistic Information
Uzuner, Ozlem
This thesis presents a technology to complement taxation-based policy proposals aimed at addressing the digital copyright problem. The approach presented facilitates identification of intellectual property using expression fingerprints. Copyright law protects expression of content. Recognizing literary works for copyright protection requires identification of the expression of their content. The expression fingerprints described in this thesis use a novel set of linguistic features that capture both the content presented in documents and the manner of expression used in conveying this content. These fingerprints consist of both syntactic and semantic elements of language. Examples of the syntactic elements of expression include structures of embedding and embedded verb phrases. The semantic elements of expression consist of high-level, broad semantic categories. Syntactic and semantic elements of expression enable generation of models that correctly identify books and their paraphrases 82% of the time, providing a significant (approximately 18%) improvement over models that use tfidf-weighted keywords. The performance of models built with these features is also better than models created with standard features used in stylometry (e.g., function words), which yield an accuracy of 62%. In the non-digital world, copyright holders collect revenues by controlling distribution of their works. Current approaches to the digital copyright problem attempt to provide copyright holders with the same kind of control over distribution by employing Digital Rights Management (DRM) systems. However, DRM systems also enable copyright holders to control and limit fair use, to inhibit others' speech, and to collect private information about individual users of digital works. Digital tracking technologies enable alternate solutions to the digital copyright problem; some of these solutions can protect creative incentives of copyright holders in the absence of control over distribution of works. Expression fingerprints facilitate digital tracking even when literary works are DRM- and watermark-free, and even when they are paraphrased. As such, they enable metering popularity of works and make practicable solutions that encourage large-scale dissemination and unrestricted use of digital works and that protect the revenues of copyright holders, for example through taxation-based revenue collection and distribution systems, without imposing limits on distribution.
2005年11月18日 00:00:00 GMThttps://hdl.handle.net/1721.1/305872005年11月18日T00:00:00ZAccurate and Scalable Surface Representation and Reconstruction from Images
https://hdl.handle.net/1721.1/30586
Accurate and Scalable Surface Representation and Reconstruction from Images
Zeng, Gang; Paris, Sylvain; Quan, Long; Sillion, Francois
We introduce a new surface representation, the patchwork, to extend the problem of surface reconstruction from multiple images. A patchwork is the combination of several patches that are built one by one. This design potentially allows the reconstruction of an object of arbitrarily large dimensions while preserving a fine level of detail. We formally demonstrate that this strategy leads to a spatial complexity independent of the dimensions of the reconstructed object, and to a time complexity linear with respect to the object area. The former property ensures that we never run out of storage (memory) and the latter means that reconstructing an object can be done in a reasonable amount of time. In addition, we show that the patchwork representation handles equivalently open and closed surfaces, whereas most of the existing approaches are limited to a specific scenario (open or closed surface but not both). Most of the existing optimization techniques can be cast into this framework. To illustrate the possibilities offered by this approach, we propose two applications that expose how it dramatically extends a recent accurate graph-cut technique. We first revisit the popular carving techniques. This results in a well-posed reconstruction problem that still enjoys the tractability of voxel space. We also show how we can advantageously combine several image-driven criteria to achieve a finely detailed geometry by surface propagation. The above properties of the patchwork representation and reconstruction are extensively demonstrated on real image sequences.
2005年11月18日 00:00:00 GMThttps://hdl.handle.net/1721.1/305862005年11月18日T00:00:00ZAnalysis of Perceptron-Based Active Learning
https://hdl.handle.net/1721.1/30585
Analysis of Perceptron-Based Active Learning
Dasgupta, Sanjoy; Kalai, Adam Tauman; Monteleoni, Claire
We start by showing that in an active learning setting, the Perceptron algorithm needs $\Omega(\frac{1}{\epsilon^2})$ labels to learn linear separators within generalization error $\epsilon$. We then present a simple selective sampling algorithm for this problem, which combines a modification of the perceptron update with an adaptive filtering rule for deciding which points to query. For data distributed uniformly over the unit sphere, we show that our algorithm reaches generalization error $\epsilon$ after asking for just $\tilde{O}(d \log \frac{1}{\epsilon})$ labels. This exponential improvement over the usual sample complexity of supervised learning has previously been demonstrated only for the computationally more complex query-by-committee algorithm.
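The querying idea can be sketched as follows: ask for a label only when the point lies close to the current hyperplane. The fixed threshold and simple loop below are simplifications introduced here; the paper's algorithm uses a specific modified perceptron update together with an adaptive filtering rule:

    import numpy as np

    rng = np.random.default_rng(0)
    d = 10
    w_true = rng.normal(size=d); w_true /= np.linalg.norm(w_true)
    w = rng.normal(size=d); w /= np.linalg.norm(w)

    threshold = 0.2   # query labels only inside this margin (simplification)
    labels_used = 0

    for _ in range(5000):
        x = rng.normal(size=d); x /= np.linalg.norm(x)   # uniform on the sphere
        if abs(w @ x) > threshold:
            continue                  # confident prediction: skip the label query
        labels_used += 1
        y = np.sign(w_true @ x)       # oracle supplies the true label
        if np.sign(w @ x) != y:
            w -= 2 * (w @ x) * x      # reflection-style update keeps ||w|| = 1

    print("labels queried:", labels_used, "alignment with target:", w @ w_true)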
2005年11月17日 00:00:00 GMThttps://hdl.handle.net/1721.1/305852005年11月17日T00:00:00ZOnline Learning of Non-stationary Sequences
https://hdl.handle.net/1721.1/30584
Online Learning of Non-stationary Sequences
Monteleoni, Claire; Jaakkola, Tommi
We consider an online learning scenario in which the learner can make predictions on the basis of a fixed set of experts. We derive upper and lower relative loss bounds for a class of universal learning algorithms involving a switching dynamics over the choice of the experts. On the basis of the performance bounds we provide the optimal a priori discretization of the switching-rate parameter that governs the switching dynamics. We demonstrate the algorithm in the context of wireless networks.
2005年11月17日 00:00:00 GMThttps://hdl.handle.net/1721.1/305842005年11月17日T00:00:00ZNew LSH-based Algorithm for Approximate Nearest Neighbor
https://hdl.handle.net/1721.1/30583
New LSH-based Algorithm for Approximate Nearest Neighbor
Andoni, Alexandr; Indyk, Piotr
We present an algorithm for the c-approximate nearest neighbor problem in a d-dimensional Euclidean space, achieving query time of $O(dn^{1/c^2+o(1)})$ and space $O(dn + n^{1+1/c^2+o(1)})$.
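For context, here is a minimal sketch of an earlier and simpler Euclidean LSH family (quantized random projections in the style of Datar et al.); the construction in this paper that achieves the exponents above is substantially more involved:

    import numpy as np

    rng = np.random.default_rng(1)

    def make_hash(d: int, w: float = 4.0):
        """One hash: quantized projection onto a random Gaussian direction."""
        a = rng.normal(size=d)   # 2-stable (Gaussian) projection vector
        b = rng.uniform(0, w)    # random offset
        return lambda x: int(np.floor((x @ a + b) / w))

    d = 16
    hashes = [make_hash(d) for _ in range(8)]
    signature = lambda x: [h(x) for h in hashes]

    p = rng.normal(size=d)
    near = p + 0.01 * rng.normal(size=d)
    far = rng.normal(size=d)
    agree = lambda u, v: sum(a == b for a, b in zip(signature(u), signature(v)))
    print(agree(p, near), agree(p, far))   # near points collide far more often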
2005年11月04日 00:00:00 GMThttps://hdl.handle.net/1721.1/305832005年11月04日T00:00:00ZOn Field Constraint Analysis
https://hdl.handle.net/1721.1/30582
On Field Constraint Analysis
Wies, Thomas; Kuncak, Viktor; Lam, Patrick; Podelski, Andreas; Rinard, Martin
We introduce field constraint analysis, a new technique for verifying data structure invariants. A field constraint for a field is a formula specifying a set of objects to which the field can point. Field constraints enable the application of decidable logics to data structures which were originally beyond the scope of these logics, by verifying the backbone of the data structure and then verifying constraints on fields that cross-cut the backbone in arbitrary ways. Previously, such cross-cutting fields could only be verified when they were uniquely determined by the backbone, which significantly limited the range of analyzable data structures. Our field constraint analysis permits \emph{non-deterministic} field constraints on cross-cutting fields, which makes it possible to verify invariants of data structures such as skip lists. Non-deterministic field constraints also enable the verification of invariants between data structures, yielding an expressive generalization of static type declarations. The generality of our field constraints requires new techniques, which are orthogonal to the traditional use of structure simulation. We present one such technique and prove its soundness. We have implemented this technique as part of a symbolic shape analysis deployed in the context of the Hob system for verifying data structure consistency. Using this implementation we were able to verify data structures that were previously beyond the reach of similar techniques.
2005年11月03日 00:00:00 GMThttps://hdl.handle.net/1721.1/305822005年11月03日T00:00:00ZSubcontracted Rational SFE
https://hdl.handle.net/1721.1/30581
Subcontracted Rational SFE
Lepinski, Matthew; Micali, Silvio
In their paper, "Rational Secure Computation and Ideal Mechanism Design," Izmalkov, Lepinski and Micali show that any one-shot mediated game can be simulated by the players themselves, without the help of a trusted mediator, using physical envelopes and a ballot-box. We show that communication between the players is not essential to the ILM protocol. That is, we provide a protocol for rational secure function evaluation (Rational SFE) where the players just send a set of envelopes to a referee who simply performs a sequence of publicly verifiable actions. That is, the players can "subcontract" all of the computation to an untrusted referee. In addition to providing a communication structure that more closely matches the ideal game, our protocol also enables us to better simulate mediated games in which abort is not a dominated action.
2005年11月02日 00:00:00 GMThttps://hdl.handle.net/1721.1/305812005年11月02日T00:00:00ZTowards Realizing the Performance and Availability Benefits of a Global Overlay Network
https://hdl.handle.net/1721.1/30580
Towards Realizing the Performance and Availability Benefits of a Global Overlay Network
Rahul, Hariharan; Kasbekar, Mangesh; Sitaraman, Ramesh; Berger, Arthur
Prior analyses of the benefits of routing overlays are based on platforms consisting of nodes located primarily in North America, on the academic Internet, and at the edge of the network. This paper is the first global study of the benefits of overlays on the commercial Internet in terms of round trip latencies and availability, using measurements from diverse ISPs over 1100 locations (77 countries, 630 cities and 6 continents). Our study shows that while overlays provide some improvements in North America, their benefits are especially significant for paths with Asian endpoints. Regarding practical considerations in constructing overlay routes, we show that an algorithm that randomly chooses a small number of alternate redundant paths achieves an availability of over 99.5%. We also propose and evaluate a simple predictive scheme that achieves almost optimal latency using only 2-3 paths, and show that this is achievable with surprisingly persistent routing choices.
2005年11月01日 00:00:00 GMThttps://hdl.handle.net/1721.1/305802005年11月01日T00:00:00ZUsing Cyclic Memory Allocation to Eliminate Memory Leaks
https://hdl.handle.net/1721.1/30579
Using Cyclic Memory Allocation to Eliminate Memory Leaks
Nguyen, Huu Hai; Rinard, Martin
We present and evaluate a new memory management technique for eliminating memory leaks in programs with dynamic memory allocation. This technique observes the execution of the program on a sequence of training inputs to find m-bounded allocation sites, which have the property that at any time during the execution of the program, the program accesses at most only the last m objects allocated at that site. The technique then transforms the program to use cyclic memory allocation at that site: it preallocates a buffer containing m objects of the type allocated at that site, with each allocation returning the next object in the buffer. At the end of the buffer the allocations wrap back around to the first object. Cyclic allocation eliminates any memory leak at the allocation site - the total amount of memory required to hold all of the objects ever allocated at the site is simply $m$ times the object size. We evaluate our technique by applying it to several widely-used open source programs. Our results show that it is able to successfully eliminate important memory leaks in these programs. A potential concern is that the estimated bounds m may be too small, causing the program to overlay live objects in memory. Our results indicate that our bounds estimation technique is quite accurate in practice, providing incorrect results for only one of the 160 m-bounded sites that it identifies. To evaluate the potential impact of overlaying live objects, we artificially reduce the bounds at $m$-bounded sites and observe the resulting behavior. The resulting overlaying of live objects often does not affect the functionality of the program at all; even when it does impair part of the functionality, the program does not fail and is still able to acceptably deliver the remaining functionality.
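A toy sketch of cyclic allocation at a single m-bounded site (the paper's technique transforms the program itself and estimates m from training runs; the class name and structure here are illustrative):

    class CyclicAllocator:
        """Preallocates m objects; allocations cycle through the buffer."""
        def __init__(self, make_obj, m: int):
            self.buffer = [make_obj() for _ in range(m)]
            self.next = 0

        def alloc(self):
            obj = self.buffer[self.next]
            self.next = (self.next + 1) % len(self.buffer)   # wrap around
            return obj

    site = CyclicAllocator(dict, m=4)
    objs = [site.alloc() for _ in range(10)]
    assert objs[0] is objs[4] is objs[8]   # slots are reused cyclically

If the site really is m-bounded, the reuse is invisible to the program, and total memory at the site stays at m times the object size no matter how many allocations occur.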
2005年10月26日 00:00:00 GMThttps://hdl.handle.net/1721.1/305792005年10月26日T00:00:00ZMPEG-2 in a Stream Programming Language
https://hdl.handle.net/1721.1/30578
MPEG-2 in a Stream Programming Language
Drake, Matthew; Hoffmann, Hank; Rabbah, Rodric; Amarasinghe, Saman
Image and video codecs are prevalent in multimedia applications, ranging from embedded systems, to desktop computers, to high-end servers such as HDTV editing consoles. It is not uncommon, however, that developers create (from scratch) and customize their codec implementations for each of the architecture targets they intend their coders and decoders to run on. This practice is time consuming and error prone, leading to code that is not malleable or portable. In this paper we describe an implementation of the MPEG-2 codec using the StreamIt programming language. StreamIt is an architecture-independent stream language that aims to improve programmer productivity, while concomitantly exposing the inherent parallelism and communication topology of the application. We describe why MPEG is a good match for the streaming programming model, and illustrate the malleability of the implementation using a simple modification to the decoder to support alternate color compression formats. StreamIt allows for modular application development, which also reduces the complexity of the debugging process since stream components can be verified independently. This in turn leads to greater programmer productivity. We implement a fully functional MPEG-2 decoder in StreamIt. The decoder was developed in eight weeks by a single student programmer who did not have any prior experience with MPEG or other video codecs. Many of the MPEG-2 components were subsequently reused to assemble a JPEG codec.
2005年10月22日 00:00:00 GMThttps://hdl.handle.net/1721.1/305782005年10月22日T00:00:00ZAsymptotics of Gaussian Regularized Least-Squares
https://hdl.handle.net/1721.1/30577
Asymptotics of Gaussian Regularized Least-Squares
Lippert, Ross; Rifkin, Ryan
We consider regularized least-squares (RLS) with a Gaussian kernel. We prove that if we let the Gaussian bandwidth $\sigma \rightarrow \infty$ while letting the regularization parameter $\lambda \rightarrow 0,ドル the RLS solution tends to a polynomial whose order is controlled by the relative rates of decay of $\frac{1}{\sigma^2}$ and $\lambda$: if $\lambda = \sigma^{-(2k+1)},ドル then, as $\sigma \rightarrow \infty,ドル the RLS solution tends to the $k$th order polynomial with minimal empirical error. We illustrate the result with an example.
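Restating the coupled limit as a display (with $f_{\sigma,\lambda}$ denoting the Gaussian RLS solution):

\[ \lambda = \sigma^{-(2k+1)}, \quad \sigma \rightarrow \infty \qquad\Longrightarrow\qquad f_{\sigma,\lambda} \;\longrightarrow\; \arg\min_{\deg p \,\le\, k} \sum_{i=1}^{n} \bigl(p(x_i) - y_i\bigr)^{2}. \]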
2005年10月20日 00:00:00 GMThttps://hdl.handle.net/1721.1/305772005年10月20日T00:00:00ZKnowledge Flow Analysis for Security Protocols
https://hdl.handle.net/1721.1/30576
Knowledge Flow Analysis for Security Protocols
Torlak, Emina; van Dijk, Marten; Gassend, Blaise; Jackson, Daniel; Devadas, Srinivas
Knowledge flow analysis offers a simple and flexible way to find flaws in security protocols. A protocol is described by a collection of rules constraining the propagation of knowledge amongst principals. Because this characterization corresponds closely to informal descriptions of protocols, it allows a succinct and natural formalization; because it abstracts away message ordering, and handles communications between principals and applications of cryptographic primitives uniformly, it is readily represented in a standard logic. A generic framework in the Alloy modelling language is presented, and instantiated for two standard protocols, and a new key management scheme.
2005年10月19日 00:00:00 GMThttps://hdl.handle.net/1721.1/305762005年10月19日T00:00:00ZTowards the Prevention of Dyslexia
https://hdl.handle.net/1721.1/30575
Towards the Prevention of Dyslexia
Geiger, Gadi; Amara, Domenic G
Previous studies have shown that dyslexic individuals who supplement windowed reading practice with intensive small-scale hand-eye coordination tasks exhibit marked improvement in their reading skills. Here we examine whether similar hand-eye coordination activities, in the form of artwork performed by children in kindergarten, first and second grades, could reduce the number of students at-risk for reading problems. Our results suggest that daily hand-eye coordination activities significantly reduce the number of students at-risk. We believe that the effectiveness of these activities derives from their ability to prepare the students perceptually for reading.
2005年10月18日 00:00:00 GMThttps://hdl.handle.net/1721.1/305752005年10月18日T00:00:00ZVictim Migration: Dynamically Adapting Between Private and Shared CMP Caches
https://hdl.handle.net/1721.1/30574
Victim Migration: Dynamically Adapting Between Private and Shared CMP Caches
Zhang, Michael; Asanovic, Krste
Future CMPs will have more cores and greater on-chip cache capacity. The on-chip cache can either be divided into separate private L2 caches for each core, or treated as a large shared L2 cache. Private caches provide low hit latency but low capacity, while shared caches have higher hit latencies but greater capacity. Victim replication was previously introduced as a way of reducing the average hit latency of a shared cache by allowing a processor to make a replica of a primary cache victim in its local slice of the global L2 cache. Although victim replication performs well on multithreaded and single-threaded codes, it performs worse than the private scheme for multiprogrammed workloads where there is little sharing between the different programs running at the same time. In this paper, we propose victim migration, which improves on victim replication by adding an additional set of migration tags on each node which are used to implement an exclusive cache policy for replicas. When a replica has been created on a remote node, it is not also cached on the home node, but only recorded in the migration tags. This frees up space on the home node to store shared global lines or replicas for the local processor. We show that victim migration performs better than private, shared, and victim replication schemes across a range of single threaded, multithreaded, and multiprogrammed workloads, while using less area than a private cache design. Victim migration provides a reduction in average memory access latency of up to 10% over victim replication.
2005年10月10日 00:00:00 GMThttps://hdl.handle.net/1721.1/305742005年10月10日T00:00:00ZLearning to Trade with Insider Information
https://hdl.handle.net/1721.1/30573
Learning to Trade with Insider Information
Das, Sanmay
This paper introduces algorithms for learning how to trade using insider (superior) information in Kyle's model of financial markets. Prior results in finance theory relied on the insider having perfect knowledge of the structure and parameters of the market. I show here that it is possible to learn the equilibrium trading strategy when its form is known, even without knowledge of the parameters governing trading in the model. However, the rate of convergence to equilibrium is slow, and an approximate algorithm that does not converge to the equilibrium strategy achieves better utility when the horizon is limited. I analyze this approximate algorithm from the perspective of reinforcement learning and discuss the importance of domain knowledge in designing a successful learning algorithm.
2005年10月07日 00:00:00 GMThttps://hdl.handle.net/1721.1/305732005年10月07日T00:00:00ZAutomatic Software Upgrades for Distributed Systems
https://hdl.handle.net/1721.1/30572
Automatic Software Upgrades for Distributed Systems
Ajmani, Sameer; Liskov, Barbara; Shrira, Liuba; Curtis, Dorothy
Upgrading the software of long-lived, highly-available distributed systems is difficult. It is not possible to upgrade all the nodes in a system at once, since some nodes may be unavailable and halting the system for an upgrade is unacceptable. Instead, upgrades must happen gradually, and there may be long periods of time when different nodes run different software versions and need to communicate using incompatible protocols. We present a methodology and infrastructure that make it possible to upgrade distributed systems automatically while limiting service disruption. We introduce new ways to reason about correctness in a multi-version system. We also describe a prototype implementation that supports automatic upgrades with modest overhead.
2005年10月06日 00:00:00 GMThttps://hdl.handle.net/1721.1/305722005年10月06日T00:00:00ZSecondary Structure Prediction of All-Helical Proteins Using Hidden Markov Support Vector Machines
https://hdl.handle.net/1721.1/30571
Secondary Structure Prediction of All-Helical Proteins Using Hidden Markov Support Vector Machines
Gassend, B.; O'Donnell, C. W.; Thies, W.; Lee, A.; van Dijk, M.; Devadas, S.
Our goal is to develop a state-of-the-art predictor with an intuitive and biophysically-motivated energy model through the use of Hidden Markov Support Vector Machines (HM-SVMs), a recent innovation in the field of machine learning. We focus on the prediction of alpha helices in proteins and show that using HM-SVMs, a simple 7-state HMM with 302 parameters can achieve a Q_alpha value of 77.6% and a SOV_alpha value of 73.4%. We briefly describe how our method can be generalized to predicting beta strands and sheets.
2005年10月06日 00:00:00 GMThttps://hdl.handle.net/1721.1/305712005年10月06日T00:00:00ZCombining diagrammatic and symbolic reasoning
https://hdl.handle.net/1721.1/30570
Combining diagrammatic and symbolic reasoning
Arkoudas, Konstantine
We introduce a domain-independent framework for heterogeneous natural deduction that combines diagrammatic and sentential reasoning. The framework is presented in the form of a family of denotational proof languages (DPLs). Diagrams are represented as possibly partial descriptions of finite system states. This allows us to deal with incomplete information, which we formalize by admitting sets as attribute values. We introduce a notion of attribute interpretations that enables us to interpret first-order signatures into such system states, and develop a formal semantic framework based on Kleene's strong three-valued logic. We extend the assumption-base semantics of DPLs to accommodate diagrammatic reasoning by introducing general inference mechanisms for the valid extraction of information from diagrams and for the incorporation of sentential information into diagrams. A rigorous big-step operational semantics is given, on the basis of which we prove that our framework is sound. In addition, we specify detailed algorithms for implementing proof checkers for the resulting languages, and discuss associated efficiency issues.
2005年10月06日 00:00:00 GMThttps://hdl.handle.net/1721.1/305702005年10月06日T00:00:00ZSpatial and Temporal Abstractions in POMDPs Applied to Robot Navigation
https://hdl.handle.net/1721.1/30569
Spatial and Temporal Abstractions in POMDPs Applied to Robot Navigation
Theocharous, Georgios; Mahadevan, Sridhar; Kaelbling, Leslie Pack
Partially observable Markov decision processes (POMDPs) are a well-studied paradigm for programming autonomous robots, where the robot sequentially chooses actions to achieve long term goals efficiently. Unfortunately, for real world robots and other similar domains, the uncertain outcomes of the actions and the fact that the true world state may not be completely observable make learning of models of the world extremely difficult, and using them algorithmically infeasible. In this paper we show that learning POMDP models and planning with them can become significantly easier when we incorporate into our algorithms the notions of spatial and temporal abstraction. We demonstrate the superiority of our algorithms by comparing them with previous flat approaches for large scale robot navigation.
2005年9月27日 00:00:00 GMThttps://hdl.handle.net/1721.1/305692005年09月27日T00:00:00ZAutomated Audio-visual Activity Analysis
https://hdl.handle.net/1721.1/30568
Automated Audio-visual Activity Analysis
Stauffer, Chris
Current computer vision techniques can effectively monitor gross activities in sparse environments. Unfortunately, visual stimulus is often not sufficient for reliably discriminating between many types of activity. In many cases where the visual information required for a particular task is extremely subtle or non-existent, there is often audio stimulus that is extremely salient for a particular classification or anomaly detection task. Unfortunately, unlike visual events, independent sounds are often very ambiguous and not sufficient to define useful events themselves. Without an effective method of learning causally-linked temporal sequences of sound events that are coupled to the visual events, these sound events are generally only useful for detecting independent anomalous sounds, e.g., a gunshot or breaking glass. This paper outlines a method for automatically detecting a set of audio events and visual events in a particular environment, for determining statistical anomalies, for automatically clustering these detected events into meaningful clusters, and for learning salient temporal relationships between the audio and visual events. This results in a compact description of the different types of compound audio-visual events in an environment.
2005年9月20日 00:00:00 GMThttps://hdl.handle.net/1721.1/305682005年09月20日T00:00:00ZLabelMe: a database and web-based tool for image annotation
https://hdl.handle.net/1721.1/30567
LabelMe: a database and web-based tool for image annotation
Russell, Bryan C.; Torralba, Antonio; Murphy, Kevin P.; Freeman, William T.
Research in object detection and recognition in cluttered scenes requires large image collections with ground truth labels. The labels should provide information about the object classes present in each image, as well as their shape and locations, and possibly other attributes such as pose. Such data is useful for testing, as well as for supervised learning. This project provides a web-based annotation tool that makes it easy to annotate images, and to instantly share such annotations with the community. This tool, plus an initial set of 10,000 images (3000 of which have been labeled), can be found at http://www.csail.mit.edu/$\sim$brussell/research/LabelMe/intro.html
2005年9月08日 00:00:00 GMThttps://hdl.handle.net/1721.1/305672005年09月08日T00:00:00ZUsing Probabilistic I/O Automata to Analyze an Oblivious Transfer Protocol
https://hdl.handle.net/1721.1/30566
Using Probabilistic I/O Automata to Analyze an Oblivious Transfer Protocol
Canetti, Ran; Cheung, Ling; Kaynar, Dilsun; Liskov, Moses; Lynch, Nancy; Pereira, Olivier; Segala, Roberto
We demonstrate how to carry out cryptographic security analysis of distributed protocols within the Probabilistic I/O Automata framework of Lynch, Segala, and Vaandrager. This framework provides tools for arguing rigorously about the concurrency and scheduling aspects of protocols, and about protocols presented at different levels of abstraction. Consequently, it can help in making cryptographic analysis more precise and less susceptible to errors. We concentrate on a relatively simple two-party Oblivious Transfer protocol, in the presence of a semi-honest adversary (essentially, an eavesdropper). For the underlying cryptographic notion of security, we use a version of Canetti's Universally Composable security. In spite of the relative simplicity of the example, the exercise is quite nontrivial. It requires taking many fundamental issues into account, including nondeterministic behavior, scheduling, resource-bounded computation, and computational hardness assumptions for cryptographic primitives.
2005年8月19日 00:00:00 GMThttps://hdl.handle.net/1721.1/305662005年08月19日T00:00:00ZCollective Choice with Uncertain Domain Models
https://hdl.handle.net/1721.1/30565
Collective Choice with Uncertain Domain Models
Richards, Whitman
When groups of individuals make choices among several alternatives, the most compelling social outcome is the Condorcet winner, namely the alternative beating all others in a pair-wise contest. Obviously the Condorcet winner cannot be overturned if one sub-group proposes another alternative it happens to favor. However, in some cases, and especially with haphazard voting, there will be no clear unique winner, with the outcome consisting of a triple of pair-wise winners that each beat different subsets of the alternatives (i.e., a "top-cycle"). We explore the sensitivity of Condorcet winners to various perturbations in the voting process that lead to top-cycles. Surprisingly, variation in the number of votes for each alternative is much less important than consistency in a voter's view of how alternatives are related. As more and more voters' preference orderings on alternatives depart from a shared model of the domain, unique Condorcet outcomes become increasingly unlikely.
2005年8月16日 00:00:00 GMThttps://hdl.handle.net/1721.1/305652005年08月16日T00:00:00ZSlicing the Onion: Anonymous Routing Without PKI
https://hdl.handle.net/1721.1/30564
Slicing the Onion: Anonymous Routing Without PKI
Katti, Sachin; Katabi, Dina; Puchala, Katarzyna
Recent years have witnessed many proposals for anonymous routing in overlay peer-to-peer networks. The proposed protocols either expose the receiver and the message content, or require the overlay nodes to have public-private key pairs with the public keys known to everyone. In practice, however, key distribution and management are well-known difficult problems and have crippled any widespread deployment of anonymous routing. This paper uses a combination of information slicing and source routing to provide anonymous communication in a way similar to Onion Routing but without a public key infrastructure (PKI).
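One simple way to realize information slices is XOR-based splitting, sketched below (the paper's construction and its interaction with source routing over disjoint overlay paths are more elaborate; this only illustrates that individual slices reveal nothing about the message):

    import secrets

    def slice_message(msg: bytes, n: int):
        """Split msg into n slices; all n are required to reconstruct."""
        slices = [secrets.token_bytes(len(msg)) for _ in range(n - 1)]
        last = msg
        for s in slices:
            last = bytes(a ^ b for a, b in zip(last, s))
        return slices + [last]

    def reassemble(slices):
        out = bytes(len(slices[0]))   # all-zero bytes of the right length
        for s in slices:
            out = bytes(a ^ b for a, b in zip(out, s))
        return out

    parts = slice_message(b"meet at noon", 3)   # send along disjoint paths
    assert reassemble(parts) == b"meet at noon"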
2005年8月15日 00:00:00 GMThttps://hdl.handle.net/1721.1/305642005年08月15日T00:00:00ZSelf-Stabilizing Mobile Node Location Management and Message Routing
https://hdl.handle.net/1721.1/30563
Self-Stabilizing Mobile Node Location Management and Message Routing
Dolev, Shlomi; Lahiani, Limor; Lynch, Nancy; Nolte, Tina
We present simple algorithms for achieving self-stabilizing location management and routing in mobile ad-hoc networks. While mobile clients may be susceptible to corruption and stopping failures, mobile networks are often deployed with a reliable GPS oracle, supplying frequent updates of accurate real time and location information to mobile nodes. Information from a GPS oracle provides an external, shared source of consistency for mobile nodes, allowing them to label and timestamp messages, and hence aiding in identification of, and eventual recovery from, corruption and failures. Our algorithms use a GPS oracle. Our algorithms also take advantage of the Virtual Stationary Automata programming abstraction, consisting of mobile clients, virtual timed machines called virtual stationary automata (VSAs), and a local broadcast service connecting VSAs and mobile clients. VSAs are distributed at known locations over the plane, and emulated in a self-stabilizing manner by the mobile nodes in the system. They serve as fault-tolerant building blocks that can interact with mobile clients and each other, and can simplify implementations of services in mobile networks. We implement three self-stabilizing, fault-tolerant services, each built on the prior services: (1) VSA-to-VSA geographic routing, (2) mobile client location management, and (3) mobile client end-to-end routing. We use a greedy version of the classical depth-first search algorithm to route messages between VSAs in different regions. The mobile client location management service is based on home locations: each client identifier hashes to a set of home locations, regions whose VSAs are periodically updated with the client's location. VSAs maintain this information and answer queries for client locations. Finally, the VSA-to-VSA routing and location management services are used to implement mobile client end-to-end routing.
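The home-location idea can be sketched in a few lines (the hash function choice, the parameter k, and the helper name are illustrative; the actual service runs on top of the VSA layer described above):

    import hashlib

    def home_regions(client_id: str, num_regions: int, k: int = 3):
        """Hash a client identifier to k distinct home-region indices."""
        homes, i = set(), 0
        while len(homes) < k:
            digest = hashlib.sha256(f"{client_id}:{i}".encode()).digest()
            homes.add(int.from_bytes(digest[:4], "big") % num_regions)
            i += 1
        return sorted(homes)

    # Any node recomputes the same homes, so location updates and
    # location queries for a client rendezvous at the same regions.
    print(home_regions("client-42", num_regions=64))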
2005年8月11日 00:00:00 GMThttps://hdl.handle.net/1721.1/305632005年08月11日T00:00:00ZImplementing Probabilistically Checkable Proofs of Proximity
https://hdl.handle.net/1721.1/30562
Implementing Probabilistically Checkable Proofs of Proximity
Bhattacharyya, Arnab
In this paper, we describe a proof-of-concept implementation of the probabilistically checkable proof of proximity (PCPP) system described by Ben-Sasson and Sudan [BS05]. In particular, we implement a PCPP prover and verifier for Reed-Solomon codes; the prover converts an evaluation of a polynomial on a linear set into a valid PCPP, while the verifier queries the evaluation and the PCPP to check that the evaluation is close to a Reed-Solomon codeword. We prove tight bounds on the various parameters associated with the prover and verifier and describe some interesting programmatic issues that arise during their implementation.
2005年8月08日 00:00:00 GMThttps://hdl.handle.net/1721.1/305622005年08月08日T00:00:00ZOn Algorithms and Complexity for Sets with Cardinality Constraints
https://hdl.handle.net/1721.1/30561
On Algorithms and Complexity for Sets with Cardinality Constraints
Marnette, Bruno; Kuncak, Viktor; Rinard, Martin
Typestate systems ensure many desirable properties of imperative programs, including initialization of object fields and correct use of stateful library interfaces. Abstract sets with cardinality constraints naturally generalize typestate properties: relationships between the typestates of objects can be expressed as subset and disjointness relations on sets, and elements of sets can be represented as sets of cardinality one. In addition, sets with cardinality constraints provide a natural language for specifying operations and invariants of data structures. Motivated by these program analysis applications, this paper presents new algorithms and new complexity results for constraints on sets and their cardinalities. We study several classes of constraints and demonstrate a trade-off between their expressive power and their complexity. Our first result concerns a quantifier-free fragment of Boolean Algebra with Presburger Arithmetic. We give a nondeterministic polynomial-time algorithm for reducing the satisfiability of sets with symbolic cardinalities to constraints on constant cardinalities, and give a polynomial-space algorithm for the resulting problem. The best previously existing algorithm runs in exponential space and nondeterministic exponential time. In a quest for more efficient fragments, we identify several subclasses of sets with cardinality constraints whose satisfiability is NP-hard. Finally, we identify a class of constraints that has polynomial-time satisfiability and entailment problems and can serve as a foundation for efficient program analysis. We give a system of rewriting rules for enforcing certain consistency properties of these constraints and show how to extract complete information from constraints in normal form. This result implies the soundness and completeness of our algorithms.
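As a small illustrative example (ours, not the paper's), a quantifier-free constraint in Boolean Algebra with Presburger Arithmetic can mix set algebra with cardinalities, e.g. a typestate-style invariant saying that the tracked objects split into initialized and uninitialized ones and that at least one object is initialized:

    C = I \cup U \;\wedge\; I \cap U = \emptyset \;\wedge\; |C| = |I| + |U| \;\wedge\; |I| \geq 1

Satisfiability asks whether some assignment of finite sets makes the whole conjunction true.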
2005年8月03日 00:00:00 GMThttps://hdl.handle.net/1721.1/305612005年08月03日T00:00:00ZHow to Construct a Correct and Scalable iBGP Configuration
https://hdl.handle.net/1721.1/30560
How to Construct a Correct and Scalable iBGP Configuration
Vutukuru, Mythili; Valiant, Paul; Kopparty, Swastik; Balakrishnan, Hari
The Border Gateway Protocol (BGP), the current interdomain routing protocol in the Internet, has two modes of operation: eBGP (External BGP), used to exchange routing information between autonomous systems, and iBGP (Internal BGP), used to propagate that information within an autonomous system (AS). This paper focuses on the construction of an iBGP session configuration that guarantees two correctness properties - loop-free forwarding paths and complete visibility to all eBGP-learned best routes - while attempting to minimize the number of iBGP sessions (for scalability) and ensuring that the constructed configuration guarantees the two correctness properties even in the face of link failures and IGP path changes. Our algorithm constructs an iBGP configuration based on route reflectors, a commonly used way to control the number of iBGP sessions. The algorithm, BGPSep, uses the notion of a graph separator, a (small) set of nodes that partitions a graph into connected components of roughly equal sizes, recursively applies this idea to the connected components, and produces a route reflector hierarchy and the associated iBGP sessions. We prove that BGPSep guarantees the desired correctness properties, and evaluate an implementation of the BGPSep algorithm on several real-world and simulated network topologies. Across these topologies, we find that the number of iBGP sessions with BGPSep is a factor of 2.5 to 5 smaller than with a "full mesh" iBGP, while guaranteeing the desired correctness properties.
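The separator recursion can be sketched as follows (hypothetical Python; the real BGPSep algorithm distinguishes session types and establishes correctness conditions this toy version omits, and find_separator stands in for an actual graph-separator routine):

    from itertools import combinations

    def components(nodes, edges):
        """Connected components of the subgraph induced by `nodes`."""
        nodes, comps = set(nodes), []
        while nodes:
            stack, comp = [nodes.pop()], set()
            while stack:
                v = stack.pop()
                comp.add(v)
                for a, b in edges:
                    for x, y in ((a, b), (b, a)):
                        if x == v and y in nodes:
                            nodes.discard(y)
                            stack.append(y)
            comps.append(comp)
        return comps

    def bgpsep_sketch(nodes, edges, find_separator):
        """Separator nodes become route reflectors: mesh them, connect every
        remaining node to them, then recurse on each remaining component."""
        sessions = set()

        def recurse(part):
            if len(part) <= 2:
                sessions.update(frozenset(p) for p in combinations(part, 2))
                return
            sep = find_separator(part, edges)   # small balanced separator (a set)
            sessions.update(frozenset(p) for p in combinations(sep, 2))
            for comp in components(part - sep, edges):
                sessions.update(frozenset((c, rr)) for c in comp for rr in sep)
                recurse(comp)

        recurse(set(nodes))
        return sessions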
2005年8月03日 00:00:00 GMThttps://hdl.handle.net/1721.1/305602005年08月03日T00:00:00ZProving Atomicity: An Assertional Approach
https://hdl.handle.net/1721.1/30559
Proving Atomicity: An Assertional Approach
Chockler, Gregory; Lynch, Nancy; Mitra, Sayan; Tauber, Joshua
Atomicity (or linearizability) is a commonly used consistency criterion for distributed services and objects. Although atomic object implementations are abundant, proving that algorithms achieve atomicity has turned out to be a challenging problem. In this paper, we initiate the study of systematic ways of verifying distributed implementations of atomic objects, beginning with read/write objects (registers). Our general approach is to replace the existing operational reasoning about events and partial orders with assertional reasoning about invariants and simulation relations. To this end, we define an abstract state machine that captures the atomicity property and prove correctness of the object implementations by establishing a simulation mapping between the implementation and the specification automata. We demonstrate the generality of our specification by showing that it is implemented by three different read/write register constructions (the message-passing register emulation of Attiya, Bar-Noy and Dolev, its optimized version based on real time, and the shared memory register construction of Vitanyi and Awerbuch), and by a general atomic object implementation based on Lamport's replicated state machine algorithm.
2005年7月22日 00:00:00 GMThttps://hdl.handle.net/1721.1/305592005年07月22日T00:00:00ZByzantine Clients Rendered Harmless
https://hdl.handle.net/1721.1/30558
Byzantine Clients Rendered Harmless
Liskov, Barbara; Rodrigues, Rodrigo
Byzantine quorum systems have been proposed that work properly even when up to f replicas fail arbitrarily. However, these systems are not so successful when confronted with Byzantine faulty clients. This paper presents novel protocols that provide atomic semantics despite Byzantine clients. Our protocols are the first to handle all problems caused by Byzantine clients. They prevent Byzantine clients from interfering with good clients: bad clients cannot prevent good clients from completing reads and writes, and they cannot cause good clients to see inconsistencies. In addition we also prevent bad clients that have been removed from operation from leaving behind more than a bounded number of writes that could be done on their behalf by a colluder. Our protocols are designed to work in an asynchronous system like the Internet and they are highly efficient. We require 3f+1 replicas, and either two or three phases to do writes; reads normally complete in one phase and require no more than two phases, no matter what the bad clients are doing. We also present strong correctness conditions for systems with Byzantine clients that limit what can be done on behalf of bad clients once they leave the system. Furthermore we prove that our protocols are both safe (they meet those conditions) and live.
2005年7月21日 00:00:00 GMThttps://hdl.handle.net/1721.1/305582005年07月21日T00:00:00ZBoosting a Biologically Inspired Local Descriptor for Geometry-free Face and Full Multi-view 3D Object Recognition
https://hdl.handle.net/1721.1/30557
Boosting a Biologically Inspired Local Descriptor for Geometry-free Face and Full Multi-view 3D Object Recognition
Yokono, Jerry Jun; Poggio, Tomaso
Object recognition systems relying on local descriptors are increasingly used because of their perceived robustness with respect to occlusions and to global geometrical deformations. Descriptors of this type -- based on a set of oriented Gaussian derivative filters -- are used in our recognition system. In this paper, we explore a multi-view 3D object recognition system that does not use explicit geometrical information. The basic idea is to find discriminant features to describe an object across different views. A boosting procedure is used to select features out of a large feature pool of local features collected from the positive training examples. We describe experiments on face images with an excellent recognition rate.
2005年7月07日 00:00:00 GMThttps://hdl.handle.net/1721.1/305572005年07月07日T00:00:00ZUltra-fast Object Recognition from Few Spikes
https://hdl.handle.net/1721.1/30556
Ultra-fast Object Recognition from Few Spikes
Hung, Chou; Kreiman, Gabriel; Poggio, Tomaso; DiCarlo, James J.
Understanding the complex brain computations leading to object recognition requires quantitatively characterizing the information represented in inferior temporal cortex (IT), the highest stage of the primate visual stream. A read-out technique based on a trainable classifier is used to characterize the neural coding of selectivity and invariance at the population level. The activity of very small populations of independently recorded IT neurons (~100 randomly selected cells) over very short time intervals (as small as 12.5 ms) contains surprisingly accurate and robust information about both object "identity" and "category", which is furthermore highly invariant to object position and scale. Significantly, selectivity and invariance are present even for novel objects, indicating that these properties arise from the intrinsic circuitry and do not require object-specific learning. Within the limits of the technique, there is no detectable difference in the latency or temporal resolution of the IT information supporting so-called "categorization" (a.k.a. basic level) and "identification" (a.k.a. subordinate level) tasks. Furthermore, other information, in particular information about stimulus location and scale, can also be read out from the same small population of IT neurons. These results show how it is possible to decode invariant object information rapidly, accurately and robustly from a small population in IT and provide insights into the nature of the neural code for different kinds of object-related information.
2005年7月06日 00:00:00 GMThttps://hdl.handle.net/1721.1/305562005年07月06日T00:00:00ZEtna: a Fault-tolerant Algorithm for Atomic Mutable DHT Data
https://hdl.handle.net/1721.1/30555
Etna: a Fault-tolerant Algorithm for Atomic Mutable DHT Data
Muthitacharoen, Athicha; Gilbert, Seth; Morris, Robert
This paper presents Etna, an algorithm for atomic reads and writes of replicated data stored in a distributed hash table. Etna correctly handles dynamically changing sets of replica hosts, and is optimized for reads, writes, and reconfiguration, in that order. Etna maintains a series of replica configurations as nodes in the system change, using new sets of replicas from the pool supplied by the distributed hash table system. It uses the Paxos protocol to ensure consensus on the members of each new configuration. For simplicity and performance, Etna serializes all reads and writes through a primary during the lifetime of each configuration. As a result, Etna completes read and write operations in only a single round from the primary. Experiments in an environment with high network delays show that Etna's read latency is determined by round-trip delay in the underlying network, while write and reconfiguration latency is determined by the transmission time required to send data to each replica. Etna's write latency is about the same as that of a non-atomic replicating DHT, and Etna's read latency is about twice that of a non-atomic DHT due to Etna assembling a quorum for every read.
2005年6月15日 00:00:00 GMThttps://hdl.handle.net/1721.1/305552005年06月15日T00:00:00ZAutonomous Virtual Mobile Nodes
https://hdl.handle.net/1721.1/30554
Autonomous Virtual Mobile Nodes
Dolev, Shlomi; Gilbert, Seth; Schiller, Elad; Shvartsman, Alex; Welch, Jennifer
This paper presents a new abstraction for virtual infrastructure in mobile ad hoc networks. An Autonomous Virtual Mobile Node (AVMN) is a robust and reliable entity that is designed to cope with the inherent difficulties caused by processors arriving, leaving, and moving according to their own agendas, as well as with failures and energy limitations. There are many types of applications that may make use of the AVMN infrastructure: tracking, supporting mobile users, or searching for energy sources. The AVMN extends the focal point abstraction in [9] and the virtual mobile node abstraction in [10]. The new abstraction is that of a virtual general-purpose computing entity, an automaton that can make autonomous on-line decisions concerning its own movement. We describe a self-stabilizing implementation of this new abstraction that is resilient to the chaotic behavior of the physical processors and provides automatic recovery from any corrupted state of the system.
2005年6月15日 00:00:00 GMThttps://hdl.handle.net/1721.1/305542005年06月15日T00:00:00ZAutomatic Test Factoring for Java
https://hdl.handle.net/1721.1/30553
Automatic Test Factoring for Java
Saff, David; Artzi, Shay; Perkins, Jeff H.; Ernst, Michael D.
Test factoring creates fast, focused unit tests from slow system-wide tests; each new unit test exercises only a subset of the functionality exercised by the system test. Augmenting a test suite with factored unit tests should catch errors earlier in a test run. One way to factor a test is to introduce 'mock' objects. If a test exercises a component T, which interacts with another component E (the 'environment'), the implementation of E can be replaced by a mock. The mock checks that T's calls to E are as expected, and it simulates E's behavior in response. We introduce an automatic technique for test factoring. Given a system test for T and E, and a record of T's and E's behavior when the system test is run, test factoring generates unit tests for T in which E is mocked. The factored tests can isolate bugs in T from bugs in E and, if E is slow or expensive, improve test performance or cost. We have built an implementation of automatic dynamic test factoring for the Java language. Our experimental data indicates that it can reduce the running time of a system test suite by up to an order of magnitude.
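The mocking idea is easy to see in miniature (hypothetical Python sketch; the paper's tool targets Java and derives the mock automatically from a recorded system-test trace):

    from unittest.mock import Mock

    # recorded behavior of the environment E during the system test
    trace = {("lookup", ("alice",)): 42, ("lookup", ("bob",)): 7}

    env = Mock()
    env.lookup.side_effect = lambda name: trace[("lookup", (name,))]

    def component_T(env):
        # the tested component interacts with E only through `env`
        return env.lookup("alice") + env.lookup("bob")

    assert component_T(env) == 49
    env.lookup.assert_any_call("alice")  # check T called E as expected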
2005年6月08日 00:00:00 GMThttps://hdl.handle.net/1721.1/305532005年06月08日T00:00:00ZNonlinear Latent Variable Models for Video Sequences
https://hdl.handle.net/1721.1/30552
Nonlinear Latent Variable Models for Video Sequences
Rahimi, Ali; Recht, Ben; Darrell, Trevor
Many high-dimensional time-varying signals can be modeled as a sequence of noisy nonlinear observations of a low-dimensional dynamical process. Given high-dimensional observations and a distribution describing the dynamical process, we present a computationally inexpensive approximate algorithm for estimating the inverse of this mapping. Once this mapping is learned, we can invert it to construct a generative model for the signals. Our algorithm can be thought of as learning a manifold of images by taking into account the dynamics underlying the low-dimensional representation of these images. It also serves as a nonlinear system identification procedure that estimates the inverse of the observation function in a nonlinear dynamical system. Our algorithm reduces to a generalized eigenvalue problem, so it does not suffer from the computational or local minimum issues traditionally associated with nonlinear system identification, allowing us to apply it to the problem of learning generative models for video sequences.
2005年6月06日 00:00:00 GMThttps://hdl.handle.net/1721.1/305522005年06月06日T00:00:00ZTheoretical Analysis of Geographic Routing in Social Networks
https://hdl.handle.net/1721.1/30551
Theoretical Analysis of Geographic Routing in Social Networks
Kumar, Ravi; Liben-Nowell, David; Novak, Jasmine; Raghavan, Prabhakar; Tomkins, Andrew
We introduce a formal model for geographic social networks, and introduce the notion of rank-based friendship, in which the probability that a person v is a friend of a person u is inversely proportional to the number of people w who live closer to u than v does. We then prove our main theorem, showing that rank-based friendship is a sufficient explanation of the navigability of any geographic social network that adheres to it.
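A hedged sketch of the rank-based friendship rule (our illustration, not the paper's code): the probability that u links to v falls off as 1/rank, where the rank of v counts the people living closer to u than v does:

    def friendship_probs(u, people, distance):
        """P[u befriends v] proportional to 1 / rank_u(v)."""
        ranked = sorted((p for p in people if p != u),
                        key=lambda v: distance(u, v))
        weights = [1.0 / rank for rank in range(1, len(ranked) + 1)]
        total = sum(weights)
        return {v: w / total for v, w in zip(ranked, weights)}

    # points on a line: rank is just distance order
    pts = {"u": 0.0, "a": 1.0, "b": 2.0, "c": 5.0}
    probs = friendship_probs("u", pts, lambda x, y: abs(pts[x] - pts[y]))
    # a (rank 1) is twice as likely a friend as b (rank 2)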
2005年6月03日 00:00:00 GMThttps://hdl.handle.net/1721.1/305512005年06月03日T00:00:00ZA Novel Active Contour Framework. Multi-component Level Set Evolution under Topology Control
https://hdl.handle.net/1721.1/30550
A Novel Active Contour Framework. Multi-component Level Set Evolution under Topology Control
Segonne, Florent; Pons, Jean-Philippe; Fischl, Bruce; Grimson, Eric
We present a novel framework to exert topology control over a level set evolution. Level set methods offer several advantages over parametric active contours, in particular automated topological changes. In some applications, where some a priori knowledge of the target topology is available, topological changes may not be desirable. A method, based on the concept of simple point borrowed from digital topology, was recently proposed to achieve strict topology preservation during a level set evolution. However, topologically constrained evolutions often generate topological barriers that lead to large geometric inconsistencies. We introduce a topologically controlled level set framework that greatly alleviates this problem. Unlike existing work, our method allows connected components to merge, split or vanish under some specific conditions that ensure that no topological defects are generated. We demonstrate the strength of our method on a wide range of numerical experiments.
2005年6月01日 00:00:00 GMThttps://hdl.handle.net/1721.1/305502005年06月01日T00:00:00ZSimultaneous Localization and Tracking in Wireless Ad-hoc Sensor Networks
https://hdl.handle.net/1721.1/30549
Simultaneous Localization and Tracking in Wireless Ad-hoc Sensor Networks
Taylor, Christopher J.
In this thesis we present LaSLAT, a sensor network algorithm that simultaneously localizes sensors, calibrates sensing hardware, and tracks unconstrained moving targets using only range measurements between the sensors and the target. LaSLAT is based on a Bayesian filter, which updates a probability distribution over the quantities of interest as measurements arrive. The algorithm is distributable, and requires only a constant amount of space with respect to the number of measurements incorporated. LaSLAT is easy to adapt to new types of hardware and new physical environments due to its use of intuitive probability distributions: one adaptation demonstrated in this thesis uses a mixture measurement model to detect and compensate for bad acoustic range measurements due to echoes. We also present results from a centralized Java implementation of LaSLAT on both two- and three-dimensional sensor networks in which ranges are obtained using the Cricket ranging system. LaSLAT is able to localize sensors to within several centimeters of their ground truth positions while recovering a range measurement bias for each sensor and the complete trajectory of the mobile.
2005年5月31日 00:00:00 GMThttps://hdl.handle.net/1721.1/305492005年05月31日T00:00:00ZEmpirical Effective Dimension and Optimal Rates for Regularized Least Squares Algorithm
https://hdl.handle.net/1721.1/30548
Empirical Effective Dimension and Optimal Rates for Regularized Least Squares Algorithm
Caponnetto, Andrea; Rosasco, Lorenzo; Vito, Ernesto De; Verri, Alessandro
This paper presents an approach to model selection for regularized least-squares on reproducing kernel Hilbert spaces in the semi-supervised setting. The role of effective dimension was recently shown to be crucial in the definition of a rule for the choice of the regularization parameter, attaining asymptotically optimal performance in a minimax sense. The main goal of the present paper is to show how the effective dimension can be replaced by an empirical counterpart while conserving optimality. The empirical effective dimension can be computed from independent unlabelled samples. This makes the approach particularly appealing in the semi-supervised setting.
2005年5月27日 00:00:00 GMThttps://hdl.handle.net/1721.1/305482005年05月27日T00:00:00ZComparing Visual Features for Morphing Based Recognition
https://hdl.handle.net/1721.1/30547
Comparing Visual Features for Morphing Based Recognition
Wu, Jia Jane
This thesis presents a method of object classification using the idea of deformable shape matching. Three types of visual features, geometric blur, C1 and SIFT, are used to generate feature descriptors. These feature descriptors are then used to find point correspondences between pairs of images. Various morphable models are created by small subsets of these correspondences using thin-plate splines. Given these morphs, a simple algorithm, least median of squares (LMEDS), is used to find the best morph. A scoring metric, using both LMEDS and the distance transform, is used to classify test images based on a nearest neighbor algorithm. We perform the experiments on the Caltech 101 dataset [5]. To ease computation, for each test image, a shortlist is created containing 10 of the most likely candidates. We were unable to duplicate the performance of [1] in the shortlist stage because we did not use hand-segmentation to extract objects for our training images. However, our gain from the shortlist to correspondence stage is comparable to theirs. In our experiments, we improved from 21% to 28% (a gain of 33%), while [1] improved from 41% to 48% (a gain of 17%). We find that with a non-shape-based approach, C2 [14], the overall classification rate of 33.61% is higher than that of all the shape-based methods tested in our experiments.
2005年5月25日 00:00:00 GMThttps://hdl.handle.net/1721.1/305472005年05月25日T00:00:00ZLexical Chains and Sliding Locality Windows in Content-based Text Similarity Detection
https://hdl.handle.net/1721.1/30546
Lexical Chains and Sliding Locality Windows in Content-based Text Similarity Detection
Nahnsen, Thade; Uzuner, Ozlem; Katz, Boris
We present a system to determine content similarity of documents. More specifically, our goal is to identify book chapters that are translations of the same original chapter; this task requires identification of not only the different topics in the documents but also the particular flow of these topics. We experiment with different representations employing n-grams of lexical chains and test these representations on a corpus of approximately 1000 chapters gathered from books with multiple parallel translations. Our representations include the cosine similarity of attribute vectors of n-grams of lexical chains, the cosine similarity of tf*idf-weighted keywords, and the cosine similarity of unweighted lexical chains (unigrams of lexical chains), as well as multiplicative combinations of the similarity measures produced by these approaches. Our results identify four-grams of unordered lexical chains as a particularly useful representation for text similarity evaluation.
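For illustration, similarity of two chapters under an n-gram-of-chains representation reduces to cosine similarity of count vectors (hypothetical code; real lexical chains would come from a chaining tool rather than being given as lists):

    from collections import Counter
    from math import sqrt

    def ngrams(chain_labels, n):
        return [tuple(chain_labels[i:i + n])
                for i in range(len(chain_labels) - n + 1)]

    def cosine(a, b):
        va, vb = Counter(a), Counter(b)
        dot = sum(va[g] * vb[g] for g in va.keys() & vb.keys())
        norm = sqrt(sum(c * c for c in va.values())) * \
               sqrt(sum(c * c for c in vb.values()))
        return dot / norm if norm else 0.0

    ch1 = ["sea", "ship", "storm", "ship", "harbor", "home"]
    ch2 = ["sea", "ship", "storm", "ship", "port", "home"]
    print(cosine(ngrams(ch1, 4), ngrams(ch2, 4)))  # four-grams of chains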
2005年5月19日 00:00:00 GMThttps://hdl.handle.net/1721.1/305462005年05月19日T00:00:00ZSome Properties of Empirical Risk Minimization over Donsker Classes
https://hdl.handle.net/1721.1/30545
Some Properties of Empirical Risk Minimization over Donsker Classes
Caponnetto, Andrea; Rakhlin, Alexander
We study properties of algorithms which minimize (or almost minimize) empirical error over a Donsker class of functions. We show that the L2-diameter of the set of almost-minimizers is converging to zero in probability. Therefore, as the number of samples grows, it is becoming unlikely that adding a point (or a number of points) to the training set will result in a large jump (in L2 distance) to a new hypothesis. We also show that under some conditions the expected errors of the almost-minimizers are becoming close with a rate faster than n^{-1/2}.
2005年5月17日 00:00:00 GMThttps://hdl.handle.net/1721.1/305452005年05月17日T00:00:00ZA Region-based Architecture for Service-Providing Distributed Systems
https://hdl.handle.net/1721.1/30544
A Region-based Architecture for Service-Providing Distributed Systems
Singh, Neha
A service-providing system consists of hosts that provide services such as data, content, computational and memory resources and data-based services to other entities in the system. Consumers that wish to use services describe their needs with a set of high-level objectives. In this thesis, we address the problem of locating services in a large-scale distributed system using their descriptions, rather than their addresses. We propose a network architecture that is based on the concept of dividing the service-providing hosts into Regions. A Region is a grouping of elements of the network that share a set of common characteristics and policies. Members of a region manage their interactions with other regions and their elements according to some defined rules and policies. Hosts can be divided into regions based on various properties such as their content, their commercial model or their security characteristics, to name a few. The service provided by a region is an aggregate of the services provided by all its member hosts. The region-based architecture routes a service request through the network efficiently based on its description and on the advertisements from regions providing services. Division of hosts into a set of independent regions partitions the search space and produces a scalable structure. The architecture also does not impose any rules on the internal organization of regions, making the system flexible and dynamic.
2005年5月17日 00:00:00 GMThttps://hdl.handle.net/1721.1/305442005年05月17日T00:00:00ZRisk Bounds for Regularized Least-squares Algorithm with Operator-valued kernels
https://hdl.handle.net/1721.1/30543
Risk Bounds for Regularized Least-squares Algorithm with Operator-valued kernels
Vito, Ernesto De; Caponnetto, Andrea
We show that recent results in [3] on risk bounds for regularized least-squares on reproducing kernel Hilbert spaces can be straightforwardly extended to the vector-valued regression setting. We first briefly introduce central concepts on operator-valued kernels. Then we show how risk bounds can be expressed in terms of a generalization of effective dimension.
2005年5月16日 00:00:00 GMThttps://hdl.handle.net/1721.1/305432005年05月16日T00:00:00ZEfficient, Verifiable Binary Sandboxing for a CISC Architecture
https://hdl.handle.net/1721.1/30542
Efficient, Verifiable Binary Sandboxing for a CISC Architecture
McCamant, Stephen; Morrisett, Greg
Executing untrusted code while preserving security requires enforcement of memory and control-flow safety policies: untrusted code must be prevented from modifying memory or executing code except as explicitly allowed. Software-based fault isolation (SFI) or "sandboxing" enforces those policies by rewriting the untrusted code at the level of individual instructions. However, the original sandboxing technique of Wahbe et al. is applicable only to RISC architectures, and other previous work is either insecure, or has not been described in enough detail to give confidence in its security properties. We present a novel technique that allows sandboxing to be easily applied to a CISC architecture like the IA-32. The technique can be verified to have been applied at load time, so that neither the rewriting tool nor the compiler needs to be trusted. We describe a prototype implementation which provides a robust security guarantee, is scalable to programs of any size, and has low runtime overheads. Further, we give a machine-checked proof that any program approved by the verification algorithm is guaranteed to respect the desired safety property.
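The underlying Wahbe-style address-masking idea (our simplified illustration, not this paper's IA-32 mechanism) fits in a few lines: every address is forced into the sandbox region before use, so a verifier only needs to confirm that the mask precedes each memory access:

    SANDBOX_TAG = 0x20000000      # assumed base of a 16 MiB sandbox region
    OFFSET_MASK = 0x00FFFFFF

    def sandboxed(addr):
        """Clamp an arbitrary address into the sandbox region."""
        return (addr & OFFSET_MASK) | SANDBOX_TAG

    for addr in (0x0, 0xDEADBEEF, 0x20000010):
        a = sandboxed(addr)
        assert SANDBOX_TAG <= a <= SANDBOX_TAG + OFFSET_MASK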
2005年5月02日 00:00:00 GMThttps://hdl.handle.net/1721.1/305422005年05月02日T00:00:00ZSimultaneous Localization, Calibration, and Tracking in an ad Hoc Sensor Network
https://hdl.handle.net/1721.1/30541
Simultaneous Localization, Calibration, and Tracking in an ad Hoc Sensor Network
Taylor, Christopher; Rahimi, Ali; Bachrach, Jonathan; Shrobe, Howard
We introduce Simultaneous Localization and Tracking (SLAT), the problem of tracking a target in a sensor network while simultaneously localizing and calibrating the nodes of the network. Our proposed solution, LaSLAT, is a Bayesian filter providing on-line probabilistic estimates of sensor locations and target tracks. It does not require globally accessible beacon signals or accurate ranging between the nodes. When applied to a network of 27 sensor nodes, our algorithm can localize the nodes to within one or two centimeters.
2005年4月26日 00:00:00 GMThttps://hdl.handle.net/1721.1/305412005年04月26日T00:00:00ZGestural Cues for Sentence Segmentation
https://hdl.handle.net/1721.1/30540
Gestural Cues for Sentence Segmentation
Eisenstein, Jacob; Davis, Randall
In human-human dialogues, face-to-face meetings are often preferred over phone conversations. One explanation is that non-verbal modalities such as gesture provide additional information, making communication more efficient and accurate. If so, computer processing of natural language could improve by attending to non-verbal modalities as well. We consider the problem of sentence segmentation, using hand-annotated gesture features to improve recognition. We find that gesture features correlate well with sentence boundaries, but that these features improve the overall performance of a language-only system only marginally. This finding is in line with previous research on this topic. We provide a regression analysis, revealing that for sentence boundary detection, the gestural features are largely redundant with the language model and pause features. This suggests that gestural features can still be useful when speech recognition is inaccurate.
2005年4月19日 00:00:00 GMThttps://hdl.handle.net/1721.1/305402005年04月19日T00:00:00ZFast Rates for Regularized Least-squares Algorithm
https://hdl.handle.net/1721.1/30539
Fast Rates for Regularized Least-squares Algorithm
Caponnetto, Andrea; Vito, Ernesto De
We develop a theoretical analysis of the generalization performance of regularized least-squares on reproducing kernel Hilbert spaces for supervised learning. We show that the concept of effective dimension of an integral operator plays a central role in the definition of a criterion for the choice of the regularization parameter as a function of the number of samples. In fact, a minimax analysis is performed which shows the asymptotic optimality of the above-mentioned criterion.
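For reference, the effective dimension in this line of work is typically defined through the integral operator T of the kernel (our gloss in the notation standard for this literature, not a quotation from the paper):

    \mathcal{N}(\lambda) = \operatorname{Tr}\left[ (T + \lambda I)^{-1} T \right], \qquad \lambda > 0

The selection rule then chooses the regularization parameter as a function of the sample size n, roughly by balancing the approximation error against \mathcal{N}(\lambda)/n.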
2005年4月14日 00:00:00 GMThttps://hdl.handle.net/1721.1/305392005年04月14日T00:00:00ZLearning From Snapshot Examples
https://hdl.handle.net/1721.1/30538
Learning From Snapshot Examples
Beal, Jacob
Examples are a powerful tool for teaching both humans and computers. In order to learn from examples, however, a student must first extract the examples from its stream of perception. Snapshot learning is a general approach to this problem, in which relevant samples of perception are used as examples. Learning from these examples can in turn improve the judgement of the snapshot mechanism, improving the quality of future examples. One way to implement snapshot learning is the Top-Cliff heuristic, which identifies relevant samples using a generalized notion of peaks. I apply snapshot learning with the Top-Cliff heuristic to solve a distributed learning problem and show that the resulting system learns rapidly and robustly, and can hallucinate useful examples in a perceptual stream from a teacherless system.
2005年4月13日 00:00:00 GMThttps://hdl.handle.net/1721.1/305382005年04月13日T00:00:00ZDe-Emphasis of Distracting Image Regions Using Texture Power Maps
https://hdl.handle.net/1721.1/30537
De-Emphasis of Distracting Image Regions Using Texture Power Maps
Su, Sara L.; Durand, Fredo; Agrawala, Maneesh
A major obstacle in photography is the presence of distracting elements that pull attention away from the main subject and clutter the composition. In this article, we present a new image-processing technique that reduces the salience of distracting regions. It is motivated by computational models of attention that predict that texture variation influences bottom-up attention mechanisms. Our method reduces the spatial variation of texture using power maps, high-order features describing local frequency content in an image. We show how modification of power maps results in powerful image de-emphasis. We validate our results using a user search experiment and eye tracking data.
2005年4月12日 00:00:00 GMThttps://hdl.handle.net/1721.1/305372005年04月12日T00:00:00ZConstruction by robot swarms using extended stigmergy
https://hdl.handle.net/1721.1/30536
Construction by robot swarms using extended stigmergy
Werfel, Justin; Bar-Yam, Yaneer; Nagpal, Radhika
We describe a system in which simple, identical, autonomous robots assemble two-dimensional structures out of identical building blocks. We show that, in a system divided in this way into mobile units and structural units, giving the blocks limited communication abilities enables robots to have sufficient global structural knowledge to rapidly build elaborate pre-designed structures. In this way we extend the principle of stigmergy (storing information in the environment) used by social insects, by increasing the capabilities of the blocks that represent that environmental information. As a result, arbitrary solid structures can be built using a few fixed, local behaviors, without requiring construction to be planned out in detail.
2005年4月08日 00:00:00 GMThttps://hdl.handle.net/1721.1/305362005年04月08日T00:00:00ZMotion Coordination Using Virtual Nodes
https://hdl.handle.net/1721.1/30535
Motion Coordination Using Virtual Nodes
Lynch, Nancy; Mitra, Sayan; Nolte, Tina
We describe how a virtual node abstraction layer can be used to coordinate the motion of real mobile nodes in a region of 2-space. In particular, we consider how nodes in a mobile ad hoc network can arrange themselves along a predetermined curve in the plane, and can maintain themselves in such a configuration in the presence of changes in the underlying mobile ad hoc network, specifically, when nodes may join or leave the system or may fail. Our strategy is to allow the mobile nodes to implement a virtual layer consisting of mobile client nodes, stationary Virtual Nodes (VNs) for predetermined zones in the plane, and local broadcast communication. The VNs coordinate among themselves to distribute the client nodes between zones based on the length of the curve through those zones, while each VN directs its zone's local client nodes to move themselves to equally spaced locations on the local portion of the target curve.
2005年4月06日 00:00:00 GMThttps://hdl.handle.net/1721.1/305352005年04月06日T00:00:00ZOn Relational Analysis of Algebraic Datatypes
https://hdl.handle.net/1721.1/30534
On Relational Analysis of Algebraic Datatypes
Kuncak, Viktor; Jackson, Daniel
We present a technique that enables the use of finite model finding to check the satisfiability of certain formulas whose intended models are infinite. Such formulas arise when using the language of sets and relations to reason about structured values such as algebraic datatypes. The key idea of our technique is to identify a natural syntactic class of formulas in relational logic for which reasoning about infinite structures can be reduced to reasoning about finite structures. As a result, when a formula belongs to this class, we can use existing finite model finding tools to check whether the formula holds in the desired infinite model.
2005年4月05日 00:00:00 GMThttps://hdl.handle.net/1721.1/305342005年04月05日T00:00:00ZWait-free Regular Storage from Byzantine Components
https://hdl.handle.net/1721.1/30533
Wait-free Regular Storage from Byzantine Components
Abraham, Ittai; Chockler, Gregory; Keidar, Idit; Malkhi, Dahlia
We present a simple, efficient, and self-contained construction of a wait-free regular register from Byzantine storage components. Our construction utilizes a novel building block, called 1-regular register, which can be implemented from Byzantine fault-prone components with the same round complexity as a safe register, and with only a slight increase in storage space.
2005年4月05日 00:00:00 GMThttps://hdl.handle.net/1721.1/305332005年04月05日T00:00:00ZAn Expectation Maximization Approach for Integrated Registration, Segmentation, and Intensity Correction
https://hdl.handle.net/1721.1/30532
An Expectation Maximization Approach for Integrated Registration, Segmentation, and Intensity Correction
Pohl, Kilian M.; Fisher, John; Grimson, W. Eric L.; Wells, William M.
This paper presents a statistical framework which combines the registration of an atlas with the segmentation of MR images. We use an Expectation Maximization-based algorithm to find a solution within the model, which simultaneously estimates image inhomogeneities, anatomical labelmap, and a mapping from the atlas to the image space. An example of the approach is given for a brain structure-dependent affine mapping approach. The algorithm produces high quality segmentations for brain tissues as well as their substructures. We demonstrate the approach on a set of 30 brain MR images. In addition, we show that the approach performs better than similar methods which separate the registration from the segmentation problem.
2005年4月01日 00:00:00 GMThttps://hdl.handle.net/1721.1/305322005年04月01日T00:00:00ZCombining Variable Selection with Dimensionality Reduction
https://hdl.handle.net/1721.1/30531
Combining Variable Selection with Dimensionality Reduction
Wolf, Lior; Bileschi, Stanley
This paper bridges the gap between variable selection methods (e.g., Pearson coefficients, KS test) and dimensionality reduction algorithms (e.g., PCA, LDA). Variable selection algorithms encounter difficulties dealing with highly correlated data, since many features are similar in quality. Dimensionality reduction algorithms tend to combine all variables and cannot select a subset of significant variables. Our approach combines both methodologies by applying variable selection followed by dimensionality reduction. This combination makes sense only when using the same utility function in both stages, which we do. The resulting algorithm benefits from complex features as variable selection algorithms do, and at the same time enjoys the benefits of dimensionality reduction.
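A toy version of the select-then-reduce pipeline (hypothetical code; the paper ties both stages to one shared utility function, which this sketch crudely approximates with a variance score):

    import numpy as np

    def select_then_reduce(X, n_select, n_components):
        score = X.var(axis=0)                    # stand-in utility function
        keep = np.argsort(score)[-n_select:]     # variable selection stage
        Xs = X[:, keep] - X[:, keep].mean(axis=0)
        _, _, Vt = np.linalg.svd(Xs, full_matrices=False)
        return Xs @ Vt[:n_components].T          # PCA-style reduction stage

    X = np.random.randn(100, 50)
    Z = select_then_reduce(X, n_select=20, n_components=5)
    print(Z.shape)  # (100, 5)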
2005年3月30日 00:00:00 GMThttps://hdl.handle.net/1721.1/305312005年03月30日T00:00:00ZMatrix Approximation and Projective Clustering via Iterative Sampling
https://hdl.handle.net/1721.1/30530
Matrix Approximation and Projective Clustering via Iterative Sampling
Rademacher, Luis; Vempala, Santosh; Wang, Grant
We present two new results for the problem of approximating a given real m by n matrix A by a rank-k matrix D, where k < min{m, n}, so as to minimize ||A-D||_F^2. It is known that by sampling O(k/eps) rows of the matrix, one can find a low-rank approximation with additive error eps||A||_F^2. Our first result shows that with adaptive sampling in t rounds and O(k/eps) samples in each round, the additive error drops exponentially as eps^t; the computation time is nearly linear in the number of nonzero entries. This demonstrates that multiple passes can be highly beneficial for a natural (and widely studied) algorithmic problem. Our second result is that there exists a subset of O(k^2/eps) rows such that their span contains a rank-k approximation with multiplicative (1+eps) error (i.e., the sum of squares distance has a small "core-set" whose span determines a good approximation). This existence theorem leads to a PTAS for the following projective clustering problem: Given a set of points P in R^d, and integers k, j, find a set of j subspaces F_1, ..., F_j, each of dimension at most k, that minimize \sum_{p \in P} \min_i d(p, F_i)^2.
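The adaptive-sampling loop can be sketched as follows (hypothetical code: rows are drawn with probability proportional to their squared residual norms each round, and the final projection onto the sampled span stands in for the paper's rank-k step):

    import numpy as np

    def span_project(A, S):
        """Project the rows of A onto the row span of S."""
        if S.size == 0:
            return np.zeros_like(A)
        Q, _ = np.linalg.qr(S.T)            # orthonormal basis of span(S)
        return A @ Q @ Q.T

    def adaptive_sample(A, k, eps, rounds, seed=0):
        rng = np.random.default_rng(seed)
        m = A.shape[0]
        S = np.empty((0, A.shape[1]))
        for _ in range(rounds):
            E = A - span_project(A, S)      # residual w.r.t. current sample
            p = (E ** 2).sum(axis=1)
            total = p.sum()
            if total == 0:                  # A already captured exactly
                break
            idx = rng.choice(m, size=int(np.ceil(k / eps)), p=p / total)
            S = np.vstack([S, A[idx]])
        return span_project(A, S)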
2005年3月29日 00:00:00 GMThttps://hdl.handle.net/1721.1/305302005年03月29日T00:00:00ZCombining Object and Feature Dynamics in Probabilistic Tracking
https://hdl.handle.net/1721.1/30529
Combining Object and Feature Dynamics in Probabilistic Tracking
Taycher, Leonid; Fisher III, John W.; Darrell, Trevor
Objects can exhibit different dynamics at different scales, a property that is often exploited by visual tracking algorithms. A local dynamic model is typically used to extract image features that are then used as inputs to a system for tracking the entire object using a global dynamic model. Approximate local dynamics may be brittle---point trackers drift due to image noise and adaptive background models adapt to foreground objects that become stationary---but constraints from the global model can make them more robust. We propose a probabilistic framework for incorporating global dynamics knowledge into the local feature extraction processes. A global tracking algorithm can be formulated as a generative model and used to predict feature values that influence the observation process of the feature extractor. We combine such models in a multichain graphical model framework. We show the utility of our framework for improving feature tracking and thus shape and motion estimates in a batch factorization algorithm. We also propose an approximate filtering algorithm appropriate for online applications, and demonstrate its application to problems such as background subtraction, structure from motion and articulated body tracking.
2005年3月02日 00:00:00 GMThttps://hdl.handle.net/1721.1/305292005年03月02日T00:00:00ZReceptive field structures for recognition
https://hdl.handle.net/1721.1/30528
Receptive field structures for recognition
Balas, Benjamin; Sinha, Pawan
Localized operators, like Gabor wavelets and difference-of-Gaussian filters, are considered to be useful tools for image representation. This is due to their ability to form a "sparse" code that can serve as a basis set for high-fidelity reconstruction of natural images. However, for many visual tasks, the more appropriate criterion of representational efficacy is "recognition", rather than "reconstruction". It is unclear whether simple local features provide the stability necessary to subserve robust recognition of complex objects. In this paper, we search the space of two-lobed differential operators for those that constitute a good representational code under recognition/discrimination criteria. We find that a novel operator, which we call the "dissociated dipole", displays useful properties in this regard. We describe simple computational experiments to assess the merits of such dipoles relative to the more traditional local operators. The results suggest that non-local operators constitute a vocabulary that is stable across a range of image transformations.
2005年3月01日 00:00:00 GMThttps://hdl.handle.net/1721.1/305282005年03月01日T00:00:00ZFile Synchronization with Vector Time Pairs
https://hdl.handle.net/1721.1/30527
File Synchronization with Vector Time Pairs
Cox, Russ; Josephson, William
Vector time pairs are a new method for tracking synchronization metadata. A vector time pair consists of two vector times: one tracking file modification history and one tracking file synchronization history. Because the vector times are maintained separately and used for different purposes, different algorithms and optimizations can be applied to each. As a result, vector time pairs impose no restriction on synchronization patterns, never falsely detect conflicts, require no space to store deletion notices, require network bandwidth proportional only to the number of files changed, and support partial synchronizations. No other current synchronization method has all these properties. Results from an implementation of vector time pairs in a new user-level file synchronizer called Tra confirm the benefits of vector time pairs.
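A toy version of the conflict test (hypothetical code; Tra's actual algorithms and optimizations are more involved): one replica's copy may safely be overwritten exactly when its modification time is dominated by the other side's synchronization time, and neither side dominating means a true conflict:

    def dominates(sync, mod):
        """Does vector time `sync` know about every event in `mod`?"""
        return all(sync.get(host, 0) >= t for host, t in mod.items())

    def sync_action(mod_a, sync_a, mod_b, sync_b):
        if dominates(sync_a, mod_b):
            return "copy A over B"   # A has already seen B's changes
        if dominates(sync_b, mod_a):
            return "keep B"          # B is strictly newer
        return "conflict"            # independent modifications

    print(sync_action({"h1": 3}, {"h1": 3, "h2": 5},
                      {"h2": 4}, {"h2": 4}))   # "copy A over B"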
2005年2月28日 00:00:00 GMThttps://hdl.handle.net/1721.1/305272005年02月28日T00:00:00ZImpossibility of boosting distributed service resilience
https://hdl.handle.net/1721.1/30526
Impossibility of boosting distributed service resilience
Attie, Paul; Guerraoui, Rachid; Kouznetsov, Petr; Lynch, Nancy; Rajsbaum, Sergio
We prove two theorems saying that no distributed system in which processes coordinate using reliable registers and f-resilient services can solve the consensus problem in the presence of f+1 undetectable process stopping failures. (A service is f-resilient if it is guaranteed to operate as long as no more than f of the processes connected to it fail.) Our first theorem assumes that the given services are atomic objects, and allows any connection pattern between processes and services. In contrast, we show that it is possible to boost the resilience of systems solving problems easier than consensus: the k-set consensus problem is solvable for 2k-1 failures using 1-resilient consensus services. The first theorem and its proof generalize to the larger class of failure-oblivious services. Our second theorem allows the system to contain failure-aware services, such as failure detectors, in addition to failure-oblivious services; however, it requires that each failure-aware service be connected to all processes. Thus, f+1 process failures overall can disable all the failure-aware services. In contrast, it is possible to boost the resilience of a system solving consensus if arbitrary patterns of connectivity are allowed between processes and failure-aware services: consensus is solvable for any number of failures using only 1-resilient 2-process perfect failure detectors.
2005年2月25日 00:00:00 GMThttps://hdl.handle.net/1721.1/305262005年02月25日T00:00:00ZDiscovering object categories in image collections
https://hdl.handle.net/1721.1/30525
Discovering object categories in image collections
Sivic, Josef; Russell, Bryan C.; Efros, Alexei A.; Zisserman, Andrew; Freeman, William T.
Given a set of images containing multiple object categories, we seek to discover those categories and their image locations without supervision. We achieve this using generative models from the statistical text literature: probabilistic Latent Semantic Analysis (pLSA), and Latent Dirichlet Allocation (LDA). In text analysis these are used to discover topics in a corpus using the bag-of-words document representation. Here we discover topics as object categories, so that an image containing instances of several categories is modelled as a mixture of topics. The models are applied to images by using a visual analogue of a word, formed by vector quantizing SIFT-like region descriptors. We investigate a set of increasingly demanding scenarios, starting with image sets containing only two object categories through to sets containing multiple categories (including airplanes, cars, faces, motorbikes, spotted cats) and background clutter. The object categories sample both intra-class and scale variation, and both the categories and their approximate spatial layout are found without supervision. We also demonstrate classification of unseen images and images containing multiple objects. Performance of the proposed unsupervised method is compared to the semi-supervised approach of Fergus et al.
2005年2月25日 00:00:00 GMThttps://hdl.handle.net/1721.1/305252005年02月25日T00:00:00ZImproving 802.11 Range with Forward Error Correction
https://hdl.handle.net/1721.1/30524
Improving 802.11 Range with Forward Error Correction
Riemann, Reina; Winstein, Keith
The ISO/IEC 8802-11:1999(E) specification uses a 32-bit CRC for error detection and whole-packet retransmissions for recovery. In long-distance or high-interference links where the probability of a bit error is high, this strategy results in excessive losses, because any erroneous bit causes an entire packet to be discarded. By ignoring the CRC and adding redundancy to 802.11 payloads in software, we achieved substantially reduced loss rates on indoor and outdoor long-distance links and extended line-of-sight range outdoors by 70 percent.
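As a minimal illustration of adding software redundancy (hypothetical code: a single XOR parity block, which recovers one erased block; the deployed system would use a stronger forward-error-correcting code able to repair bit errors inside 802.11 payloads):

    from functools import reduce

    def xor_blocks(a, b):
        return bytes(x ^ y for x, y in zip(a, b))

    def add_parity(blocks):
        """Append one parity block: the XOR of all data blocks."""
        return blocks + [reduce(xor_blocks, blocks)]

    def recover(coded, lost_index):
        """Rebuild one erased block by XORing everything that survived."""
        rest = [b for i, b in enumerate(coded) if i != lost_index]
        return reduce(xor_blocks, rest)

    data = [b"abcd", b"efgh", b"ijkl"]
    coded = add_parity(data)
    assert recover(coded, 1) == b"efgh"   # one erased block is recoverable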
2005年2月24日 00:00:00 GMThttps://hdl.handle.net/1721.1/305242005年02月24日T00:00:00ZComplexity of finding Nash equilibria in 0-1 bimatrix games
https://hdl.handle.net/1721.1/30523
Complexity of finding Nash equilibria in 0-1 bimatrix games
Abbott, Tim; Kane, Daniel; Valiant, Paul
We exhibit a polynomial reduction from the problem of finding a Nash equilibrium of a bimatrix game with rational coefficients to the problem of finding a Nash equilibrium of a bimatrix game with 0-1 coefficients.
2005年2月08日 00:00:00 GMThttps://hdl.handle.net/1721.1/305232005年02月08日T00:00:00ZStable Policy Routing with Provider Independence
https://hdl.handle.net/1721.1/30522
Stable Policy Routing with Provider Independence
Feamster, Nick; Johari, Ramesh; Balakrishnan, Hari
Thousands of competing autonomous systems (ASes) must cooperate with each other to provide global Internet connectivity. These ASes encode various economic, business, and performance decisions in their routing policies. The current interdomain routing system enables ASes to express policy using rankings that determine how each router in an AS orders the different routes to a destination, and filters that determine which routes are hidden from each neighboring AS. Since the Internet is composed of many independent, competing networks, the interdomain routing system should allow providers to set their rankings independently, and to have no constraints on allowed filters. This paper studies routing protocol stability under these constraints. We first demonstrate that certain rankings that are commonly used in practice may not ensure routing stability. We then prove that, with ranking independence and unrestricted filtering, guaranteeing that the routing system will converge to a stable path assignment essentially requires ASes to rank routes based on AS-path lengths. Finally, we discuss the implications of these results for the future of interdomain routing.
2005年2月08日 00:00:00 GMThttps://hdl.handle.net/1721.1/305222005年02月08日T00:00:00ZUsing computational models to study texture representations in the human visual system.
https://hdl.handle.net/1721.1/30521
Using computational models to study texture representations in the human visual system.
Balas, Benjamin
Traditionally, human texture perception has been studied using artificial textures made of random-dot patterns or abstract structured elements. At the same time, computer algorithms for the synthesis of natural textures have improved dramatically. The current study seeks to unify these two fields of research through a psychophysical assessment of a particular computational model, thus providing a sense of what image statistics are most vital for representing a range of natural textures. We employ Portilla and Simoncelli's 2000 model of texture synthesis for this task (a parametric model of analysis and synthesis designed to mimic computations carried out by the human visual system). We find an intriguing interaction between texture type (periodic vs. structured) and image statistics (autocorrelation function and filter magnitude correlations), suggesting different processing strategies may be employed for these two texture families under pre-attentive viewing.
2005年2月07日 00:00:00 GMThttps://hdl.handle.net/1721.1/305212005年02月07日T00:00:00ZFunctional Differential Geometry
https://hdl.handle.net/1721.1/30520
Functional Differential Geometry
Sussman, Gerald Jay; Wisdom, Jack
Differential geometry is deceptively simple. It is surprisingly easy to get the right answer with unclear and informal symbol manipulation. To address this problem we use computer programs to communicate a precise understanding of the computations in differential geometry. Expressing the methods of differential geometry in a computer language forces them to be unambiguous and computationally effective. The task of formulating a method as a computer-executable program and debugging that program is a powerful exercise in the learning process. Also, once formalized procedurally, a mathematical idea becomes a tool that can be used directly to compute results.
2005年2月02日 00:00:00 GMThttps://hdl.handle.net/1721.1/305202005年02月02日T00:00:00ZThe Security Power of the Ballot Box
https://hdl.handle.net/1721.1/30519
The Security Power of the Ballot Box
Lepinski, Matt; Izmalkov, Sergei
We show that any function F can be securely evaluated by a protocol with ballots and a ballot box. That is, N mutually suspicious players, each player possessing a secret input, can use ballots and a ballot box to jointly evaluate F on their secret inputs so that (no matter how many players may collude and deviate from their prescribed instructions, and no matter how long they compute!) each player learns exactly the output of the function with the same privacy and correctness as if all players privately handed their secret inputs to a trusted party, who privately evaluates F and privately returns the outputs to each player. Our protocol is (1) efficient, (2) enjoys perfect privacy, (3) guarantees perfect correctness, (4) is universally composable, and (5) is collusion-free even for games with secret actions.
2005年2月02日 00:00:00 GMThttps://hdl.handle.net/1721.1/305192005年02月02日T00:00:00ZDetermining articulator configuration in voiced stop consonants by matching time-domain patterns in pitch periods
https://hdl.handle.net/1721.1/30518
Determining articulator configuration in voiced stop consonants by matching time-domain patterns in pitch periods
Kondacs, Attila
In this thesis I will be concerned with linking the observed speech signal to the configuration of articulators. Due to the potentially rapid motion of the articulators, the speech signal can be highly non-stationary. The typical linear analysis techniques that assume quasi-stationarity may not have sufficient time-frequency resolution to determine the place of articulation. I argue that the traditional low and high-level primitives of speech processing, frequency and phonemes, are inadequate and should be replaced by a representation with three layers: 1. short pitch period resonances and other spatio-temporal patterns 2. articulator configuration trajectories 3. syllables. The patterns indicate articulator configuration trajectories (how the tongue, jaws, etc. are moving), which are interpreted as syllables and words. My patterns are an alternative to frequency. I use short time-domain features of the sound waveform, which can be extracted from each vowel pitch period pattern, to identify the positions of the articulators with high reliability. These features are important because by capitalizing on detailed measurements within a single pitch period, the rapid articulator movements can be tracked. No linear signal processing approach can achieve the combination of sensitivity to short term changes and measurement accuracy resulting from these nonlinear techniques. The measurements I use are neurophysiologically plausible: the auditory system could be using similar methods. I have demonstrated this approach by constructing a robust technique for categorizing the English voiced stops as the consonants B, D, or G based on the vocalic portions of their releases. The classification recognizes 93.5%, 81.8% and 86.1% of the b, d and g to ae transitions with false positive rates 2.9%, 8.7% and 2.6% respectively.
2005年1月28日 00:00:00 GMThttps://hdl.handle.net/1721.1/305182005年01月28日T00:00:00ZVirtual Stationary Automata for Mobile Networks
https://hdl.handle.net/1721.1/30517
Virtual Stationary Automata for Mobile Networks
Dolev, Shlomi; Gilbert, Seth; Lahiani, Limor; Lynch, Nancy; Nolte, Tina
We define a programming abstraction for mobile networks called the Virtual Stationary Automata programming layer, consisting of real mobile clients, virtual timed I/O automata called virtual stationary automata (VSAs), and a communication service connecting VSAs and client nodes. The VSAs are located at prespecified regions that tile the plane, defining a static virtual infrastructure. We present a self-stabilizing algorithm to emulate a VSA using the real mobile nodes that are currently residing in the VSA's region. We also describe several examples of applications whose implementations benefit from the simplicity obtained through use of the VSA abstraction.
2005年1月21日 00:00:00 GMThttps://hdl.handle.net/1721.1/305172005年01月21日T00:00:00ZBiologically-Inspired Robust Spatial Programming
https://hdl.handle.net/1721.1/30516
Biologically-Inspired Robust Spatial Programming
Beal, Jacob; Sussman, Gerald
Inspired by the robustness and flexibility of biological systems, we are developing linguistic and programming tools to allow us to program spatial systems populated by vast numbers of unreliable components interconnected in unknown, irregular, and time-varying ways. We organize our computations around geometry, making the fact that our system is made up of discrete individuals implicit. Geometry allows us to specify requirements in terms of the behavior of the space occupied by the aggregate rather than the behavior of individuals, thereby decreasing complexity. So we describe the behavior of space explicitly, abstracting away the discrete nature of the components. As an example, we present the Amorphous Medium Language, which describes behavior in terms of homeostatic maintenance of constraints on nested regions of space.
2005年1月18日 00:00:00 GMT
https://hdl.handle.net/1721.1/30515
How Much of a Hypertree can be Captured by Windmills?
Liang, Percy; Srebro, Nati
Current approximation algorithms for maximum weight hypertrees find heavy windmill farms, and are based on the fact that a constant ratio (for constant width k) of the weight of a k-hypertree can be captured by a k-windmill farm. However, the exact worst-case ratio is not known and is only bounded to be between 1/(k+1)! and 1/(k+1). We investigate this worst-case ratio by searching for weighted hypertrees that minimize the ratio of their weight that can be captured with a windmill farm. To do so, we use a novel approach in which a linear program is used to find "bad" inputs to a dynamic program.
2005年1月03日 00:00:00 GMT
https://hdl.handle.net/1721.1/30514
A Dynamic Data Structure for Checking Hyperacyclicity
Liang, Percy; Srebro, Nati
We present a dynamic data structure that keeps track of an acyclic hypergraph (equivalently, a triangulated graph) and enables verifying that adding a candidate hyperedge (clique) will not break the acyclicity of the augmented hypergraph. This is a generalization of the use of Tarjan's Union-Find data structure for maintaining acyclicity when augmenting forests, and the amortized time per operation has a similar almost-constant dependence on the size of the hypergraph. Such a data structure is useful when augmenting acyclic hypergraphs, e.g., in order to greedily construct a high-weight acyclic hypergraph. In designing this data structure, we introduce a hierarchical decomposition of acyclic hypergraphs that aids in understanding hyper-connectivity, and introduce a novel concept of a hypercycle which is excluded from acyclic hypergraphs.
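The forest special case that this structure generalizes is easy to make concrete. Below is a minimal Python sketch (illustrative, not the authors' code) of Tarjan's Union-Find with path compression and union by rank, used to check that adding a candidate edge keeps a graph a forest:

```python
class UnionFind:
    """Union-Find with path compression and union by rank."""
    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n

    def find(self, x):
        # Path halving: point x at its grandparent while walking up.
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False  # already connected: the edge would close a cycle
        if self.rank[ra] < self.rank[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        if self.rank[ra] == self.rank[rb]:
            self.rank[ra] += 1
        return True

uf = UnionFind(4)
assert uf.union(0, 1)      # ok: extends the forest
assert uf.union(1, 2)      # ok
assert not uf.union(0, 2)  # rejected: 0-2 would close a cycle
```

The hypergraph structure of the paper plays the analogous role for hyperedges, with extra machinery to exclude hypercycles.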
2005年1月03日 00:00:00 GMT
https://hdl.handle.net/1721.1/30513
Neural Voting Machines
Richards, Whitman; Seung, H. Sebastian
Winner-take-all networks typically pick as winner the alternative with the largest excitatory input. This choice is far from optimal when there is uncertainty in the strength of the inputs, and when information is available about how alternatives may be related. In the Social Choice community, many other procedures will yield more robust winners. The Borda Count and the pair-wise Condorcet tally are among the most favored. Their implementations are simple modifications of classical recurrent networks.
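As a toy illustration of why rank-based tallies are more robust than a raw maximum, here is a Python sketch of a Borda count next to the max-input rule (illustrative only; the paper's contribution is the recurrent-network implementation, not this code):

```python
def max_input_winner(strengths):
    """Classical winner-take-all: the alternative with the largest input."""
    return max(strengths, key=strengths.get)

def borda_winner(ballots):
    """Borda count: on a ballot ranking m alternatives, the item in
    position p (0 = top) earns m - 1 - p points; highest total wins."""
    scores = {}
    for ballot in ballots:
        m = len(ballot)
        for p, alt in enumerate(ballot):
            scores[alt] = scores.get(alt, 0) + (m - 1 - p)
    return max(scores, key=scores.get)

# Three noisy "voters" rank alternatives a, b, c:
ballots = [["a", "b", "c"], ["b", "a", "c"], ["b", "c", "a"]]
print(borda_winner(ballots))  # b: never ranked last, wins the tally
```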
2004年12月31日 00:00:00 GMT
https://hdl.handle.net/1721.1/30512
A general mechanism for tuning: Gain control circuits and synapses underlie tuning of cortical neurons
Kouh, Minjoon; Poggio, Tomaso
Tuning to an optimal stimulus is a widespread property of neurons in cortex. We propose that such tuning is a consequence of normalization or gain control circuits. We also present a biologically plausible neural circuit for tuning.
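As a rough numerical illustration of tuning from gain control, consider a unit whose weighted input is divided by the pooled norm of its afferents; the specific functional form below is an assumption for illustration, not the circuitry proposed in the report:

```python
import numpy as np

def normalized_response(x, w, sigma=1.0):
    """Divisive normalization: the response peaks when the input
    pattern x is parallel to the weight vector w (the optimal
    stimulus), largely independent of overall input strength."""
    return np.dot(w, x) / (sigma + np.linalg.norm(x))

w = np.array([0.6, 0.8])               # preferred stimulus direction
print(normalized_response(10 * w, w))  # preferred pattern at high contrast
print(normalized_response(np.array([8.0, -6.0]), w))  # orthogonal input: ~0
```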
2004年12月31日 00:00:00 GMT
https://hdl.handle.net/1721.1/30511
Methods and Experiments With Bounded Tree-width Markov Networks
Liang, Percy; Srebro, Nathan
Markov trees generalize naturally to bounded tree-width Markov networks, on which exact computations can still be done efficiently. However, learning the maximum likelihood Markov network with tree-width greater than 1 is NP-hard, so we discuss a few algorithms for approximating the optimal Markov network. We present a set of methods for training a density estimator. Each method is specified by three arguments: tree-width, model scoring metric (maximum likelihood or minimum description length), and model representation (using one joint distribution or several class-conditional distributions). On these methods, we give empirical results on density estimation and classification tasks and explore the implications of these arguments.
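For the tree-width-1 base case, the maximum-likelihood Markov tree is the classical Chow-Liu tree: a maximum spanning tree under pairwise empirical mutual information. A self-contained Python sketch of that baseline (illustrative; not the report's tree-width-k algorithms):

```python
import numpy as np

def mutual_information(xi, xj):
    """Empirical mutual information (in nats) between two discrete columns."""
    mi = 0.0
    for a in np.unique(xi):
        for b in np.unique(xj):
            pab = np.mean((xi == a) & (xj == b))
            pa, pb = np.mean(xi == a), np.mean(xj == b)
            if pab > 0:
                mi += pab * np.log(pab / (pa * pb))
    return mi

def max_spanning_tree(W):
    """Prim's algorithm for a maximum-weight spanning tree of the
    complete graph with symmetric weight matrix W."""
    d = W.shape[0]
    in_tree, edges = {0}, []
    while len(in_tree) < d:
        i, j = max(((i, j) for i in in_tree
                    for j in range(d) if j not in in_tree),
                   key=lambda e: W[e])
        edges.append((i, j))
        in_tree.add(j)
    return edges

def chow_liu_edges(X):
    """Maximum-likelihood Markov tree over the columns of X."""
    d = X.shape[1]
    W = np.zeros((d, d))
    for i in range(d):
        for j in range(i + 1, d):
            W[i, j] = W[j, i] = mutual_information(X[:, i], X[:, j])
    return max_spanning_tree(W)

rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(500, 4))
X[:, 1] = X[:, 0]  # feature 1 is a copy of feature 0
print(chow_liu_edges(X))  # the edge (0, 1) should appear
```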
2004年12月30日 00:00:00 GMT
https://hdl.handle.net/1721.1/30510
Machine-Checkable Correctness Proofs for Intra-procedural Dataflow Analyses
Salcianu, Alexandru; Arkoudas, Konstantine
This technical report describes our experience using the interactive theorem prover Athena for proving the correctness of abstract interpretation-based dataflow analyses. For each analysis, our methodology requires the analysis designer to formally specify the property lattice, the transfer functions, and the desired modeling relation between the concrete program states and the results computed by the analysis. The goal of the correctness proof is to prove that the desired modeling relation holds. The proof allows the analysis clients to rely on the modeling relation for their own correctness. To reduce the complexity of the proofs, we separate the proof of each dataflow analysis into two parts: a generic part, proven once, independent of any specific analysis; and several analysis-specific conditions proven in Athena.
2004年12月16日 00:00:00 GMT
https://hdl.handle.net/1721.1/30509
On Decision Procedures for Set-Valued Fields
Kuncak, Viktor; Rinard, Martin
An important feature of object-oriented programming languages is the ability to dynamically instantiate user-defined container data structures such as lists, trees, and hash tables. Programs implement such data structures using references to dynamically allocated objects, which allows data structures to store unbounded numbers of objects, but makes reasoning about programs more difficult. Reasoning about object-oriented programs with complex data structures is simplified if data structure operations are specified in terms of abstract sets of objects associated with each data structure. For example, an insertion into a data structure in this approach becomes simply an insertion into a dynamically changing set-valued field of an object, as opposed to a manipulation of a dynamically linked structure linked to the object. In this paper we explore reasoning techniques for programs that manipulate data structures specified using set-valued abstract fields associated with container objects. We compare the expressive power and the complexity of specification languages based on 1) decidable prefix vocabulary classes of first-order logic, 2) two-variable logic with counting, and 3) Nelson-Oppen combinations of multisorted theories. Such specification logics can be used for verification of object-oriented programs with supplied invariants. Moreover, by selecting an appropriate subset of properties expressible in such logic, the decision procedures for these logics yield automated computation of lattice operations in the abstract interpretation domain, as well as automated computation of abstract program semantics.
2004年11月30日 00:00:00 GMT
https://hdl.handle.net/1721.1/30508
Comparing Network Coding with Multicommodity Flow for the k-pairs Communication Problem
Harvey, Nicholas J.; Kleinberg, Robert D.; Lehman, April Rasala
Given a graph G = (V,E) and k source-sink pairs of vertices, this paper investigates the maximum rate r at which all pairs can simultaneously communicate. We view this problem from two perspectives and compare their advantages. In the multicommodity flow formulation, a solution provides dedicated bandwidth r between each source-sink pair. In the information flow formulation, a vertex can transmit a function of the information it received, thereby allowing multiple source-sink pairs to share bandwidth. For directed acyclic graphs with n vertices, we show that the rate achievable in the information flow formulation can be a multiplicative factor n larger than the rate achievable in the multicommodity flow formulation. It is well known [5] that for undirected graphs with n vertices, in the multicommodity flow formulation, the maximum rate achievable can be an O(1/log|V|) multiplicative factor smaller than the value of the sparsest cut. We extend this result to show that the maximum rate achievable in the information flow setting can be an O(1/log|V|) multiplicative factor smaller than the sparsest cut value. For directed acyclic graphs G, we define a parameter called the value of the most meager cut, which is an upper bound for the maximum rate achievable in the information flow setting. We also present an example illustrating that this upper bound is not always tight.
2004年11月24日 00:00:00 GMT
https://hdl.handle.net/1721.1/30507
Learning with Matrix Factorizations
Srebro, Nathan
Matrices that can be factored into a product of two simpler matrices can serve as a useful and often natural model in the analysis of tabulated or high-dimensional data. Models based on matrix factorization (Factor Analysis, PCA) have been extensively used in statistical analysis and machine learning for over a century, with many new formulations and models suggested in recent years (Latent Semantic Indexing, Aspect Models, Probabilistic PCA, Exponential PCA, Non-Negative Matrix Factorization and others). In this thesis we address several issues related to learning with matrix factorizations: we study the asymptotic behavior and generalization ability of existing methods, suggest new optimization methods, and present a novel maximum-margin high-dimensional matrix factorization formulation.
2004年11月22日 00:00:00 GMT
https://hdl.handle.net/1721.1/30506
Availability-Consistency Trade-Offs in a Fault-Tolerant Stream Processing System
Balazinska, Magdalena; Balakrishnan, Hari; Madden, Samuel; Stonebraker, Mike
In contrast to previous techniques that handle node failures, our approach also tolerates network failures and network partitions. The approach is based on a principled trade-off between consistency and availability in the face of failure, that (1) ensures that all data on an input stream is processed within a specified time threshold, but (2) reduces the impact of failures by limiting, if possible, the number of results produced based on partially available input data, and (3) corrects these results when failures heal. Our approach is well-suited for applications such as environment monitoring, where high availability and real-time response is preferable to perfect answers. Our approach uses replication and guarantees that all processing replicas achieve state consistency, both in the absence of failures and after a failure heals. We achieve consistency in the former case by defining a data-serializing operator that ensures that the order of tuples to a downstream operator is the same at all the replicas. To achieve consistency after a failure heals, we develop approaches based on checkpoint/redo and undo/redo techniques. We have implemented these schemes in a prototype distributed stream processing system, and present experimental results that show that the system meets the desired availability-consistency trade-offs.
2004年11月22日 00:00:00 GMT
https://hdl.handle.net/1721.1/30505
Efficient Image Matching with Distributions of Local Invariant Features
Grauman, Kristen; Darrell, Trevor
Sets of local features that are invariant to common image transformations are an effective representation to use when comparing images; current methods typically judge feature sets' similarity via a voting scheme (which ignores co-occurrence statistics) or by comparing histograms over a set of prototypes (which must be found by clustering). We present a method for efficiently comparing images based on their discrete distributions (bags) of distinctive local invariant features, without clustering descriptors. Similarity between images is measured with an approximation of the Earth Mover's Distance (EMD), which quickly computes the minimal-cost correspondence between two bags of features. Each image's feature distribution is mapped into a normed space with a low-distortion embedding of EMD. Examples most similar to a novel query image are retrieved in time sublinear in the number of examples via approximate nearest neighbor search in the embedded space. We also show how the feature representation may be extended to encode the distribution of geometric constraints between the invariant features appearing in each image. We evaluate our technique with scene recognition and texture classification tasks.
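The quantity being approximated, the minimal-cost correspondence between two bags of features, can be computed exactly for small equal-size bags with the Hungarian algorithm; a toy Python sketch (exact but cubic-time, unlike the paper's embedding-based sublinear retrieval):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def matching_cost(bag_a, bag_b):
    """Minimal-cost one-to-one correspondence between two equal-size
    bags of feature vectors, via the Hungarian algorithm."""
    cost = np.linalg.norm(bag_a[:, None, :] - bag_b[None, :, :], axis=2)
    rows, cols = linear_sum_assignment(cost)
    return cost[rows, cols].sum()

rng = np.random.default_rng(0)
a = rng.normal(size=(5, 8))             # 5 local descriptors, 8-D each
b = a + 0.01 * rng.normal(size=(5, 8))  # near-duplicate "image"
c = rng.normal(size=(5, 8))             # unrelated "image"
print(matching_cost(a, b), matching_cost(a, c))  # small vs. large
```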
2004年11月22日 00:00:00 GMT
https://hdl.handle.net/1721.1/30504
A new biologically motivated framework for robust object recognition
Serre, Thomas; Wolf, Lior; Poggio, Tomaso
In this paper, we introduce a novel set of features for robust object recognition, which exhibits outstanding performance on a variety of object categories while being capable of learning from only a few training examples. Each element of this set is a complex feature obtained by combining position- and scale-tolerant edge-detectors over neighboring positions and multiple orientations. Our system - motivated by a quantitative model of visual cortex - outperforms state-of-the-art systems on a variety of object image datasets from different groups. We also show that our system is able to learn from very few examples with no prior category knowledge. The success of the approach is also a suggestive plausibility proof for a class of feed-forward models of object recognition in cortex. Finally, we conjecture the existence of a universal overcomplete dictionary of features that could handle the recognition of all object categories.
2004年11月14日 00:00:00 GMT
https://hdl.handle.net/1721.1/30503
Capacity Allocation in Wireless LANs
Tan, Godfrey; Guttag, John
Today's access point based wireless LANs (WLANs) are inefficient and unfair. For many traffic loads they provide far less total throughput than they should, and do a poor job allocating what throughput they do deliver. Inappropriate association of nodes to access points and rates to flows plays a large role in these problems. We address a major root cause of this problem in this paper. Current practice ignores the distinction between flows that connect two wireless nodes via an access point and flows that connect wireless nodes to the wired infrastructure. As wireless devices and applications become more pervasive, ignoring this distinction will lead to a significant degradation in perceived performance. In this paper, we i) describe a series of examples that illustrates the impact of two-hop flows on the performance of the system, ii) provide a practical algorithm to solve the AP-assignment problem and iii) evaluate the performance of our algorithm against other approaches. Our preliminary results show that our algorithm can increase average achieved throughput by as much as 50% for some traffic loads.
2004年11月12日 00:00:00 GMT
https://hdl.handle.net/1721.1/30502
Regularization Through Feature Knock Out
Wolf, Lior; Martin, Ian
In this paper, we present and analyze a novel regularization technique based on enhancing our dataset with corrupted copies of the original data. The motivation is that since the learning algorithm lacks information about which parts of the data are reliable, it has to produce more robust classification functions. We then demonstrate how this regularization leads to redundancy in the resulting classifiers, which is somewhat in contrast to the common interpretations of the Occam's razor principle. Using this framework, we propose a simple addition to the gentle boosting algorithm which enables it to work with only a few examples. We test this new algorithm on a variety of datasets and show convincing results.
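One plausible reading of "enhancing the dataset with corrupted copies" is to overwrite a randomly chosen feature of each copied example with that feature's value from another example. The sketch below is a hypothetical illustration of that idea in Python, not the paper's exact knock-out procedure:

```python
import numpy as np

def knockout_augment(X, y, n_copies=1, rng=None):
    """Augment (X, y) with corrupted copies: each copy takes a random
    training example and overwrites one randomly chosen feature with
    that feature's value from another random example (a hypothetical
    corruption scheme; the paper's exact recipe may differ)."""
    rng = rng or np.random.default_rng()
    n, d = X.shape
    rows = rng.integers(0, n, size=n_copies * n)    # examples to copy
    donors = rng.integers(0, n, size=n_copies * n)  # feature donors
    feats = rng.integers(0, d, size=n_copies * n)   # knocked-out feature
    X_new = X[rows].copy()
    X_new[np.arange(len(rows)), feats] = X[donors, feats]
    return np.vstack([X, X_new]), np.concatenate([y, y[rows]])

X = np.arange(12.0).reshape(4, 3)
y = np.array([0, 0, 1, 1])
X_aug, y_aug = knockout_augment(X, y, n_copies=2)
print(X_aug.shape, y_aug.shape)  # (12, 3) (12,)
```

Since no single feature can be trusted after corruption, a learner trained on the augmented set is pushed toward redundant classifiers, which is the effect the paper analyzes.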
2004年11月12日 00:00:00 GMT
https://hdl.handle.net/1721.1/30501
Shape Representation in V4: Investigating Position-Specific Tuning for Boundary Conformation with the Standard Model of Object Recognition
Cadieu, Charles; Kouh, Minjoon; Riesenhuber, Maximilian; Poggio, Tomaso
The computational processes in the intermediate stages of the ventral pathway responsible for visual object recognition are not well understood. A recent physiological study by A. Pasupathy and C. Connor in intermediate area V4 using contour stimuli proposes that a population of V4 neurons display object-centered, position-specific curvature tuning [18]. The "standard model" of object recognition, a recently developed model [23] to account for recognition properties of IT cells (extending classical suggestions by Hubel, Wiesel and others [9, 10, 19]), is used here to model the response of the V4 cells described in [18]. Our results show that a feedforward, network-level mechanism can exhibit selectivity and invariance properties that correspond to the responses of the V4 cells described in [18]. These results suggest how object-centered, position-specific curvature tuning of V4 cells may arise from combinations of complex V1 cell responses. Furthermore, the model makes predictions about the responses of the same V4 cells studied by Pasupathy and Connor to novel gray level patterns, such as gratings and natural images. These predictions suggest specific experiments to further explore shape representation in V4.
2004年11月12日 00:00:00 GMT
https://hdl.handle.net/1721.1/30500
Neural Network Models for Zebra Finch Song Production and Reinforcement Learning
Werfel, Justin
The zebra finch is a standard experimental system for studying learning and generation of temporally extended motor patterns. The first part of this project concerned the evaluation of simple models for the operation and structure of the network in the motor nucleus RA. A directed excitatory chain with a global inhibitory network, for which experimental evidence exists, was found to produce waves of activity similar to those observed in RA; this similarity included one particularly important feature of the measured activity, synchrony between the onset of bursting in one neuron and the offset of bursting in another. Other models, which were simpler and more analytically tractable, were also able to exhibit this feature, but not for parameter values quantitatively close to those observed. Another issue of interest concerns how these networks are initially learned by the bird during song acquisition. The second part of the project concerned the analysis of exemplars of REINFORCE algorithms, a general class of algorithms for reinforcement learning in neural networks, which are on several counts more biologically plausible than standard prescriptions such as backpropagation. The former compared favorably with backpropagation on tasks involving single input-output pairs, though a noise analysis suggested it should not perform so well. On tasks involving trajectory learning, REINFORCE algorithms meet with some success, though the analysis that predicts their success on input-output-pair tasks fails to explain it for trajectories.
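A minimal instance of the REINFORCE family, sketched for orientation (not one of the models analyzed in the project): stochastic Bernoulli output units trained on a single input-output pair by following reward-weighted log-likelihood gradients with a running baseline:

```python
import numpy as np

def reinforce_single_pair(x, target, steps=2000, lr=0.1, rng=None):
    """Minimal REINFORCE: one layer of stochastic Bernoulli units with
    firing probability p = sigmoid(W @ x). The scalar reward is the
    negative Hamming error, and weights move along
    (r - baseline) * d log P(a) / dW = (r - baseline) * outer(a - p, x)."""
    rng = rng or np.random.default_rng(0)
    W = np.zeros((len(target), len(x)))
    baseline = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-W @ x))
        a = (rng.random(len(target)) < p).astype(float)  # sample actions
        r = -np.abs(a - target).sum()                    # reward signal
        W += lr * (r - baseline) * np.outer(a - p, x)
        baseline += 0.05 * (r - baseline)                # running baseline
    return W

x = np.array([1.0, 0.0, 1.0])
target = np.array([1.0, 0.0])
W = reinforce_single_pair(x, target)
print(np.round(1.0 / (1.0 + np.exp(-W @ x)), 2))  # should approach [1, 0]
```

The update uses only a scalar reward broadcast to all units, which is the sense in which the class is more biologically plausible than backpropagation.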
2004年11月09日 00:00:00 GMT
https://hdl.handle.net/1721.1/30499
Managing the 802.11 Energy/Performance Tradeoff with Machine Learning
Monteleoni, Claire; Balakrishnan, Hari; Feamster, Nick; Jaakkola, Tommi
This paper addresses the problem of managing the tradeoff between energy consumption and performance in wireless devices implementing the IEEE 802.11 standard. To save energy, the 802.11 specification proposes a power-saving mode (PSM), where a device can sleep to save energy, periodically waking up to receive packets from a neighbor (e.g., an access point) that may have buffered packets for the sleeping device. Previous work has shown that a fixed polling time for waking up degrades the performance of Web transfers, because network activity is bursty and time-varying. We apply a new online machine learning algorithm to this problem and show, using ns simulation and trace analysis, that it is able to adapt well to network activity. The learning process makes no assumptions about the underlying network activity being stationary or even Markov. Our learning power-saving algorithm, LPSM, guides the learning using a "loss function" that combines the increased latency from potentially sleeping too long and the wasted use of energy in waking up too soon. In our ns simulations, LPSM saved 7%-20% more energy than 802.11 in power-saving mode, with an associated increase in average latency by a factor of 1.02, and not more than 1.2. LPSM is straightforward to implement within the 802.11 PSM framework.
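The flavor of such an online learner can be sketched with an exponentially weighted experts update, where each expert is a candidate polling interval and the per-round loss mixes latency and energy. This is a generic sketch in the spirit of LPSM with made-up losses, not the paper's algorithm:

```python
import numpy as np

def weighted_experts(losses_per_expert, eta=0.5):
    """Exponentially weighted average over candidate sleep durations.
    Each column of losses_per_expert is a candidate polling interval;
    each row is one wake-up round's combined latency+energy loss.
    Returns the weight trajectory (rounds + 1, experts)."""
    T, k = losses_per_expert.shape
    w = np.ones(k) / k
    history = [w.copy()]
    for t in range(T):
        w *= np.exp(-eta * losses_per_expert[t])  # penalize lossy experts
        w /= w.sum()                              # renormalize
        history.append(w.copy())
    return np.array(history)

# Toy losses: expert 0 = short sleep (low latency, high energy),
# expert 2 = long sleep (high latency, low energy), expert 1 in between.
rng = np.random.default_rng(1)
losses = np.tile([0.6, 0.3, 0.7], (50, 1)) + 0.05 * rng.normal(size=(50, 3))
print(np.round(weighted_experts(losses)[-1], 2))  # mass concentrates on expert 1
```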
2004年10月27日 00:00:00 GMT
https://hdl.handle.net/1721.1/30498
On Spatial Conjunction as Second-Order Logic
Kuncak, Viktor; Rinard, Martin
Spatial conjunction is a powerful construct for reasoning about dynamically allocated data structures, as well as concurrent, distributed and mobile computation. While researchers have identified many uses of spatial conjunction, its precise expressive power compared to traditional logical constructs was not previously known. In this paper we establish the expressive power of spatial conjunction. We construct an embedding from first-order logic with spatial conjunction into second-order logic, and more surprisingly, an embedding from full second-order logic into first-order logic with spatial conjunction. These embeddings show that the satisfiability of formulas in first-order logic with spatial conjunction is equivalent to the satisfiability of formulas in second-order logic. These results explain the great expressive power of spatial conjunction and can be used to show that adding unrestricted spatial conjunction to a decidable logic leads to an undecidable logic. As one example, we show that adding unrestricted spatial conjunction to two-variable logic leads to undecidability. On the side of decidability, the embedding into second-order logic immediately implies the decidability of first-order logic with a form of spatial conjunction over trees. The embedding into spatial conjunction also has useful consequences: because a restricted form of spatial conjunction in two-variable logic preserves decidability, we obtain that a correspondingly restricted form of second-order quantification in two-variable logic is decidable. The resulting language generalizes the first-order theory of boolean algebra over sets and is useful in reasoning about the contents of data structures in object-oriented languages.
2004年10月25日 00:00:00 GMT
https://hdl.handle.net/1721.1/30497
Botz-4-Sale: Surviving Organized DDoS Attacks that Mimic Flash Crowds
Kandula, Srikanth; Katabi, Dina; Jacob, Matthias; Berger, Arthur
Recent denial of service attacks are mounted by professionals using Botnets of tens of thousands of compromised machines. To circumvent detection, attackers are increasingly moving away from pure bandwidth floods to attacks that mimic the Web browsing behavior of a large number of clients, and target expensive higher-layer resources such as CPU, database and disk bandwidth. The resulting attacks are hard to defend against using standard techniques as the malicious requests differ from the legitimate ones in intent but not in content. We present the design and implementation of Kill-Bots, a kernel extension to protect Web servers against DDoS attacks that masquerade as flash crowds. Kill-Bots provides authentication using graphical tests but is different from other systems that use graphical tests. First, instead of authenticating clients based on whether they solve the graphical test, Kill-Bots uses the test to quickly identify the IP addresses of the attack machines. This allows it to block the malicious requests while allowing access to legitimate users who are unable or unwilling to solve graphical tests. Second, Kill-Bots sends a test and checks the client's answer without allowing unauthenticated clients access to sockets, TCBs, worker processes, etc. This protects the authentication mechanism from being DDoSed. Third, Kill-Bots combines authentication with admission control. As a result, it improves performance, regardless of whether the server overload is caused by DDoS or a true Flash Crowd. We have implemented Kill-Bots in the Linux kernel and evaluated it in the wide-area Internet using PlanetLab.
2004年10月22日 00:00:00 GMT
https://hdl.handle.net/1721.1/30496
Combining dynamic abstractions in large MDPs
Steinkraus, Kurt; Kaelbling, Leslie Pack
One of the reasons that it is difficult to plan and act in real-world domains is that they are very large. Existing research generally deals with the large domain size using a static representation and exploiting a single type of domain structure. In this paper, we create a framework that encapsulates existing and new abstraction and approximation methods into modules, and combines arbitrary modules into a system that allows for dynamic representation changes. We show that the dynamic changes of representation allow our framework to solve larger and more interesting domains than were previously possible, and while there are no optimality guarantees, suitable module choices gain tractability at little cost to optimality.
2004年10月21日 00:00:00 GMT
https://hdl.handle.net/1721.1/30495
NIRA: A New Internet Routing Architecture
Yang, Xiaowei
The present Internet routing system faces two challenging problems. First, unlike in the telephone system, Internet users cannot choose their wide-area Internet service providers (ISPs) separately from their local access providers. With the introduction of new technologies such as broadband residential service and fiber-to-the-home, the local ISP market is often a monopoly or a duopoly. The lack of user choice is likely to reduce competition among wide-area ISPs, limiting the incentives for wide-area ISPs to improve quality of service, reduce price, and offer new services. Second, the present routing system fails to scale effectively in the presence of real-world requirements such as multi-homing for robust and redundant Internet access. A multi-homed site increases the amount of routing state maintained globally by the Internet routing system. As the demand for multi-homing continues to rise, the amount of routing state continues to grow. This dissertation presents the design of a new Internet routing architecture (NIRA) that simultaneously addresses these two problems. NIRA gives a user the ability to choose the sequence of Internet service providers his packets traverse. It also has better scaling characteristics than today's routing system. The design of NIRA is decomposed into four modular components: route discovery, route availability discovery, route representation and packet forwarding, and provider compensation. This dissertation describes mechanisms to realize each of these components. It also makes clear those places in the design where a globally agreed mechanism is needed, and those places where alternative mechanisms can be designed and deployed locally. In particular, this dissertation describes a scalable route discovery mechanism. With this mechanism, a user only needs to know a small region of the Internet in order to select a route to reach a destination. In addition, a novel route representation and packet forwarding scheme is designed such that a source and a destination address can uniquely represent a sequence of providers a packet traverses. Network measurement, simulation, and analytic modeling are used in combination to evaluate the design of NIRA. The evaluation suggests that NIRA is scalable.
2004年10月14日 00:00:00 GMT
https://hdl.handle.net/1721.1/30494
Byzantine Fault Tolerance in Long-Lived Systems
Rodrigues, Rodrigo; Liskov, Barbara
This paper proposes counter-measures that can be deployed as part of a replicated system to reduce the size of W, and thus reduce the class of attacks to which the system is vulnerable. Obviously it will not be possible to withstand all attacks via this technique, in particular attacks with very small A. But we will propose techniques that can reduce W to quite a small value. In the remainder of this paper, we discuss how to lower the value of W. We begin by discussing attacks. Then we discuss some prior work in this area and why it is insufficient. The final section describes the approach we propose.
2004年8月13日 00:00:00 GMT
https://hdl.handle.net/1721.1/30493
EpiChord: Parallelizing the Chord Lookup Algorithm with Reactive Routing State Management
Leong, Ben; Liskov, Barbara; Demaine, Erik D.
EpiChord is a DHT lookup algorithm that demonstrates that we can remove the O(log n)-state-per-node restriction on existing DHT topologies to achieve significantly better lookup performance and resilience, using a novel reactive routing state maintenance strategy that amortizes network maintenance costs into existing lookups and by issuing parallel queries. Our technique allows us to design a new class of unlimited-state-per-node DHTs that is able to adapt naturally to a wide range of lookup workloads. EpiChord is able to achieve O(1)-hop lookup performance under lookup-intensive workloads, and at least O(log n)-hop lookup performance under churn-intensive workloads even in the worst case (though it is expected to perform better on average). Our reactive routing state maintenance strategy allows us to maintain large amounts of routing state with only a modest amount of bandwidth, while parallel queries serve to reduce lookup latency and allow us to avoid costly lookup timeouts. In general, EpiChord exploits the information gleaned from observing lookup traffic to improve lookup performance, and only sends network probes when necessary. Nodes populate their caches mainly from observing network traffic, and cache entries are flushed from the cache after a fixed lifetime. Our simulations show that our approach can reduce both lookup latencies and path lengths by a factor of 3 by issuing only 3 queries asynchronously in parallel per lookup. Furthermore, we show that we are able to achieve this result with minimal additional communication overhead, and the number of messages generated per lookup is no more than that for the corresponding sequential Chord lookup algorithm over a range of lookup workloads. We also present a novel token-passing stabilization scheme that automatically detects and repairs global routing inconsistencies.
2004年8月13日 00:00:00 GMT
https://hdl.handle.net/1721.1/30492
Early Sketch Processing with Application in HMM Based Sketch Recognition
Sezgin, Tevfik Metin; Davis, Randall
Freehand sketching is a natural and crucial part of everyday human interaction, yet is almost totally unsupported by current user interfaces. With the increasing availability of tablet notebooks and pen-based PDAs, sketch-based interaction has gained attention as a natural interaction modality. We are working to combine the flexibility and ease of use of paper and pencil with the processing power of a computer, to produce a user interface for design that feels as natural as paper, yet is considerably smarter. One of the most basic tasks in accomplishing this is converting the original digitized pen strokes in a sketch into the intended geometric objects. In this paper we describe an implemented system that combines multiple sources of knowledge to provide robust early processing for freehand sketching. We also show how this early processing system can be used as part of a fast sketch recognition system with polynomial time segmentation and recognition algorithms.
2004年7月28日 00:00:00 GMT
https://hdl.handle.net/1721.1/30491
Realistic Modeling of Simple and Complex Cell Tuning in the HMAX Model, and Implications for Invariant Object Recognition in Cortex
Serre, Thomas; Riesenhuber, Maximilian
Riesenhuber & Poggio recently proposed a model of object recognition in cortex which, beyond integrating general beliefs about the visual system in a quantitative framework, made testable predictions about visual processing. In particular, they showed that invariant object representation could be obtained with a selective pooling mechanism over properly chosen afferents through a MAX operation: for instance, at the complex cell level, pooling over a group of simple cells at the same preferred orientation and position in space but at slightly different spatial frequency would provide scale tolerance, while pooling over a group of simple cells at the same preferred orientation and spatial frequency but at slightly different position in space would provide position tolerance. Indirect support for such mechanisms in the visual system comes from the ability of the architecture at the top level to replicate shape tuning as well as shift and size invariance properties of "view-tuned cells" (VTUs) found in inferotemporal cortex (IT), the highest area in the ventral visual stream, thought to be crucial in mediating object recognition in cortex. There is also now good physiological evidence that a MAX operation is performed at various levels along the ventral stream. However, in the original paper by Riesenhuber & Poggio, tuning and pooling parameters of model units in early and intermediate areas were only qualitatively inspired by physiological data. In particular, many studies have investigated the tuning properties of simple and complex cells in primary visual cortex, V1. We show that units in the early levels of HMAX can be tuned to produce realistic simple and complex cell-like tuning, and that the earlier findings on the invariance properties of model VTUs still hold in this more realistic version of the model.
2004年7月27日 00:00:00 GMT
https://hdl.handle.net/1721.1/30490
Distribution Volume Tracking on Privacy-Enhanced Wireless Grid
Uzuner, Ozlem
In this paper, we discuss a wireless grid in which users are highly mobile, and form ad-hoc and sometimes short-lived connections with other devices. As they roam through networks, the users may choose to employ privacy-enhancing technologies to address their privacy needs and benefit from the computational power of the grid for a variety of tasks, including sharing content. The high rate of mobility of the users on the wireless grid, when combined with privacy enhancing mechanisms and ad-hoc connections, makes it difficult to conclusively link devices and/or individuals with network activities and to hold them liable for particular downloads. Protecting intellectual property in this scenario requires a solution that can work in absence of knowledge about behavior of particular individuals. Building on previous work, we argue for a solution that ensures proper compensation to content owners without inhibiting use and dissemination of works. Our proposal is based on digital tracking for measuring distribution volume of content and compensation of authors based on this accounting information. The emphasis is on obtaining good estimates of rate of popularity of works, without keeping track of activities of individuals or devices. The contribution of this paper is a revenue protection mechanism, Distribution Volume Tracking, that does not invade the privacy of users in the wireless grid and works even in the presence of privacy-enhancing technologies they may employ.
2004年7月25日 00:00:00 GMT
https://hdl.handle.net/1721.1/30489
Discovering Latent Classes in Relational Data
Kemp, Charles; Griffiths, Thomas L.; Tenenbaum, Joshua B.
We present a framework for learning abstract relational knowledge with the aim of explaining how people acquire intuitive theories of physical, biological, or social systems. Our approach is based on a generative relational model with latent classes, and simultaneously determines the kinds of entities that exist in a domain, the number of these latent classes, and the relations between classes that are possible or likely. This model goes beyond previous psychological models of category learning, which consider attributes associated with individual categories but not relationships between categories. We apply this domain-general framework to two specific problems: learning the structure of kinship systems and learning causal theories.
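The generative picture, latent classes for entities plus class-pair link probabilities, resembles what is now called a stochastic block model. A toy Python sampler of that picture (hypothetical parameters, forward sampling only, no inference):

```python
import numpy as np

def sample_relational_model(n, class_probs, link_probs, rng=None):
    """Toy generative relational model with latent classes: each entity
    draws a class, and each ordered pair (i, j) is linked with a
    probability that depends only on the pair's classes (a stochastic-
    block-model reading of the abstract, with made-up parameters)."""
    rng = rng or np.random.default_rng(0)
    z = rng.choice(len(class_probs), size=n, p=class_probs)
    R = rng.random((n, n)) < link_probs[z[:, None], z[None, :]]
    return z, R

class_probs = np.array([0.5, 0.5])
link_probs = np.array([[0.9, 0.1],   # class 0 mostly relates to class 0
                       [0.1, 0.9]])  # class 1 mostly relates to class 1
z, R = sample_relational_model(8, class_probs, link_probs)
print(z)
print(R.astype(int))
```

Inverting this process, inferring the classes, their number, and the link probabilities from an observed relation R, is the learning problem the framework addresses.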
2004年7月22日 00:00:00 GMT
https://hdl.handle.net/1721.1/30488
An Algorithm for Deciding BAPA: Boolean Algebra with Presburger Arithmetic
Kuncak, Viktor; Nguyen, Huu Hai; Rinard, Martin
We describe an algorithm for deciding the first-order multisorted theory BAPA, which combines 1) Boolean algebras of sets of uninterpreted elements (BA) and 2) Presburger arithmetic operations (PA). BAPA can express the relationship between integer variables and cardinalities of sets, and supports arbitrary quantification over both sets and integers. Our motivation for BAPA is deciding verification conditions that arise in the static analysis of data structure consistency properties. Data structures often use an integer variable to keep track of the number of elements they store; an invariant of such a data structure is that the value of the integer variable is equal to the number of elements stored in the data structure. When the data structure content is represented by a set, the resulting constraints can be captured in BAPA. BAPA formulas with quantifier alternations arise when annotations contain quantifiers themselves, or when proving simulation relation conditions for refinement and equivalence of program fragments. Furthermore, BAPA constraints can be used to extend the techniques for proving the termination of integer programs to programs that manipulate data structures, and have applications in constraint databases. We give a formal description of a decision procedure for BAPA, which implies the decidability of the satisfiability and validity problems for BAPA. We analyze our algorithm and obtain an elementary upper bound on the running time, thereby giving the first complexity bound for BAPA. Because it works by a reduction to PA, our algorithm yields the decidability of a combination of sets of uninterpreted elements with any decidable extension of PA. Our algorithm can also be used to yield an optimal decision procedure for BA through a reduction to PA with bounded quantifiers. We have implemented our algorithm and used it to discharge verification conditions in the Jahob system for data structure consistency checking of Java programs; our experience with the algorithm is promising.
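The kind of cardinality fact BAPA decides can at least be sanity-checked by brute force over tiny universes. The Python sketch below exhaustively checks one such formula for small models; it is of course not a decision procedure, since BAPA's algorithm covers all cardinalities at once:

```python
from itertools import product

def holds_for_all_small_models(n_max=4):
    """Brute-force check of the BAPA-expressible formula
    "for all sets A, B: |A union B| = |A| + |B| - |A intersect B|"
    over every pair of subsets of universes of size 0..n_max."""
    for n in range(n_max + 1):
        universe = range(n)
        for bits_a, bits_b in product(product([0, 1], repeat=n), repeat=2):
            A = {x for x in universe if bits_a[x]}
            B = {x for x in universe if bits_b[x]}
            if len(A | B) != len(A) + len(B) - len(A & B):
                return False
    return True

print(holds_for_all_small_models())  # True
```

A data structure verification condition of the kind the abstract mentions, "size field equals the cardinality of the content set after an insertion", has exactly this shape of mixed set and integer reasoning.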
2004年7月19日 00:00:00 GMT
https://hdl.handle.net/1721.1/30487
Definition and Expansion of Composite Automata in IOA
Tauber, Joshua A.; Garland, Stephen J.
The IOA language provides notations for defining both primitive and composite I/O automata. This note describes, both formally and with examples, the constraints on these definitions, the composability requirements for the components of a composite automaton, and the transformation of a composite automaton into an equivalent primitive automaton. Section 2 introduces four examples used throughout this note to illustrate new definitions and operations. Section 3 treats IOA programs for primitive I/O automata: it introduces notations for describing the syntactic structures that appear in these programs, and it lists syntactic and semantic conditions that these programs must satisfy to represent valid primitive I/O automata. Section 4 describes how to reformulate primitive IOA programs into an equivalent but more regular (desugared) form that is used in later definitions in this note. Section 5 treats IOA programs for composite I/O automata: it introduces notations for describing the syntactic structures that appear in these programs, describes resortings induced by them, and lists syntactic and semantic conditions that these programs must satisfy to represent valid composite I/O automata. Section 6 describes the translation of the name spaces of component automata into a unified name space for the composite automaton. Section 7 shows how to expand an IOA program for a composite automaton into an equivalent IOA program for a primitive automaton. The expansion is generated by combining syntactic structures of the desugared programs for the component automata after applying appropriate replacements of sorts and variables. Section 8 details the expansion of the composite automaton introduced in Section 2 using the desugared forms developed throughout Sections 4-6 and the techniques described in Section 7. Finally, Section 9 gives a precise definition of the resortings and substitutions used to replace sorts and variables.
2004年7月19日 00:00:00 GMT
https://hdl.handle.net/1721.1/30486
Systematic Removal of Nondeterminism for Code Generation in I/O Automata
Vaziri, Mandana; Tauber, Joshua A.; Tsai, Michael J.; Lynch, Nancy
The Input/Output (I/O) automaton model developed by Lynch and Tuttle models components in asynchronous concurrent systems as labeled transition systems. IOA is a precise language for describing I/O automata and for stating their properties. A toolset is being developed for IOA to support distributed software design and implementation. One of the tools consists of a user-assisted code generator from IOA into an imperative programming language such as C or Java. One aspect that distinguishes IOA programs from programs written in imperative languages is the presence of nondeterminism, which comes in the form of explicit nondeterministic statements and implicit scheduling choices made during execution. Code generation therefore consists partially of systematically removing all forms of nondeterminism. In this paper, we describe our approach and design for code generation. We focus on the issue of removing implicit nondeterminism and specify a transformation on IOA programs that makes all nondeterminism explicit. The programmer can then replace all explicit nondeterminism with deterministic statements prior to code generation. We also describe this transformation at a semantic level, i.e., at the level of the I/O automaton mathematical model. We show that the transformation defined at the IOA level conforms to the one at the semantic level.
2004年7月19日 00:00:00 GMT
https://hdl.handle.net/1721.1/30485
Dynamically Resizable Static CMOS Logic for Fine-Grain Leakage Reduction
Heo, Seongmoo; Asanovic, Krste
Digital circuits often have a critical path that runs through a small subset of the component subblocks, but where the path changes dynamically during operation. Dynamically resizable static CMOS (DRCMOS) logic is proposed as a fine-grain leakage reduction technique that dynamically downsizes transistors in inactive subblocks while maintaining speed in subblocks along the current critical path. A 64-entry register free list and a 64-entry pick-two arbiter are used to evaluate DRCMOS. DRCMOS is shown to give a 50% reduction in total power for equal delay in a 70 nm technology.
2004年7月12日 00:00:00 GMT
https://hdl.handle.net/1721.1/30484
A Constant-Factor Approximation Algorithm for Embedding Unweighted Graphs into Trees
Badoiu, Mihai; Indyk, Piotr; Sidiropoulos, Anastasios
We present a constant-factor approximation algorithm for computing an embedding of the shortest path metric of an unweighted graph into a tree that minimizes the multiplicative distortion.
2004年7月05日 00:00:00 GMT
https://hdl.handle.net/1721.1/30483
Optimal Approximations of the Frequency Moments
Indyk, Piotr; Woodruff, David
We give a one-pass, O~(m^{1-2/k})-space algorithm for estimating the k-th frequency moment of a data stream for any real k>2. Together with known lower bounds, this resolves the main problem left open by Alon, Matias, Szegedy, STOC'96. Our algorithm enables deletions as well as insertions of stream elements.
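Here the k-th frequency moment is F_k = sum_i f_i^k, where f_i is the net frequency of item i. An exact Python baseline supporting insertions and deletions (linear space in the number of distinct items; the point of the streaming algorithm above is to approximate this in sublinear space):

```python
from collections import Counter

def frequency_moment(stream, k):
    """Exact k-th frequency moment F_k = sum_i f_i**k, where f_i is the
    net frequency of item i after insertions (+1) and deletions (-1)."""
    f = Counter()
    for item, delta in stream:
        f[item] += delta
    return sum(c ** k for c in f.values() if c)

stream = [("a", +1), ("b", +1), ("a", +1), ("b", -1), ("a", +1)]
print(frequency_moment(stream, 3))  # f = {a: 3, b: 0}; F_3 = 27
```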
2004年7月02日 00:00:00 GMT
https://hdl.handle.net/1721.1/30482
Contextual models for object detection using boosted random fields
Torralba, Antonio; Murphy, Kevin P.; Freeman, William T.
We seek to both detect and segment objects in images. To exploit both local image data as well as contextual information, we introduce Boosted Random Fields (BRFs), which uses Boosting to learn the graph structure and local evidence of a conditional random field (CRF). The graph structure is learned by assembling graph fragments in an additive model. The connections between individual pixels are not very informative, but by using dense graphs, we can pool information from large regions of the image; dense models also support efficient inference. We show how contextual information from other objects can improve detection performance, both in terms of accuracy and speed, by using a computational cascade. We apply our system to detect stuff and things in office and street scenes.
2004年6月25日 00:00:00 GMT
https://hdl.handle.net/1721.1/30481
Middleboxes No Longer Considered Harmful
Walfish, Michael; Stribling, Jeremy; Krohn, Maxwell; Balakrishnan, Hari; Morris, Robert; Shenker, Scott
Intermediate network elements, such as network address translators (NATs), firewalls, and transparent caches are now commonplace. The usual reaction in the network architecture community to these so-called middleboxes is a combination of scorn (because they violate important architectural principles) and dismay (because these violations make the Internet less flexible). While we acknowledge these concerns, we also recognize that middleboxes have become an Internet fact of life for important reasons. To retain their functions while eliminating their dangerous side-effects, we propose an extension to the Internet architecture, called the Delegation-Oriented Architecture (DOA), that not only allows, but also facilitates, the deployment of middleboxes. DOA involves two relatively modest changes to the current architecture: (a) a set of references that are carried in packets and serve as persistent host identifiers and (b) a way to resolve these references to delegates chosen by the referenced host.
2004年6月24日 00:00:00 GMT
https://hdl.handle.net/1721.1/30480
How People Re-find Information When the Web Changes
Teevan, Jaime
This paper investigates how people return to information in a dynamic information environment. For example, a person might want to return to Web content via a link encountered earlier on a Web page, only to learn that the link has since been removed. Changes can benefit users by providing new information, but they hinder returning to previously viewed information. The observational study presented here analyzed instances, collected via a Web search, where people expressed difficulty re-finding information because of changes to the information or its environment. A number of interesting observations arose from this analysis, including that the path originally taken to get to the information target appeared important in its re-retrieval, whereas, surprisingly, the temporal aspects of when the information was seen before were not. While people expressed frustration when problems arose, an explanation of why the change had occurred was often sufficient to allay that frustration, even in the absence of a solution. The implications of these observations for systems that support re-finding in dynamic environments are discussed.
2004年6月18日 00:00:00 GMT
https://hdl.handle.net/1721.1/30479
Building Grounded Abstractions for Artificial Intelligence Programming
Hearn, Robert A.
Most Artificial Intelligence (AI) work can be characterized as either "high-level" (e.g., logical, symbolic) or "low-level" (e.g., connectionist networks, behavior-based robotics). Each approach suffers from particular drawbacks. High-level AI uses abstractions that often have no relation to the way real, biological brains work. Low-level AI, on the other hand, tends to lack the powerful abstractions that are needed to express complex structures and relationships. I have tried to combine the best features of both approaches, by building a set of programming abstractions defined in terms of simple, biologically plausible components. At the "ground level", I define a primitive, perceptron-like computational unit. I then show how more abstract computational units may be implemented in terms of the primitive units, and show the utility of the abstract units in sample networks. The new units make it possible to build networks using concepts such as long-term memories, short-term memories, and frames. As a demonstration of these abstractions, I have implemented a simulator for "creatures" controlled by a network of abstract units. The creatures exist in a simple 2D world, and exhibit behaviors such as catching mobile prey and sorting colored blocks into matching boxes. This program demonstrates that it is possible to build systems that can interact effectively with a dynamic physical environment, yet use symbolic representations to control aspects of their behavior.
2004年6月16日 00:00:00 GMT
https://hdl.handle.net/1721.1/30478
Versatility and VersaBench: A New Metric and a Benchmark Suite for Flexible Architectures
Rabbah, Rodric M.; Bratt, Ian; Asanovic, Krste; Agarwal, Anant
For the last several decades, computer architecture research has largely benefited from, and continues to be driven by, ad-hoc benchmarking. Often the benchmarks are selected to represent workloads that architects believe should run on the computational platforms they design. For example, benchmark suites such as SPEC, Winstone, and MediaBench, which represent workstation, desktop and media workloads respectively, have influenced computer architecture innovation for the last decade. Recently, advances in VLSI technology have created an increasing interest within the computer architecture community to build a new kind of processor that is more flexible than extant general purpose processors. Such new processor architectures must efficiently support a broad class of applications including graphics, networking, and signal processing in addition to the traditional desktop workloads. Thus, given this new focus on flexibility, a new benchmark suite and new metrics are necessary to accurately reflect the goals of the architecture community. This paper thus proposes VersaBench as a new benchmark suite, and a new Versatility measure to characterize architectural flexibility, or in other words, the ability of the architecture to effectively execute a wide array of workloads. The benchmark suite is composed of applications drawn from several domains including desktop, server, stream, and bit-level processing. The Versatility measure is a single scalar metric inspired by the SPEC paradigm. It normalizes processor performance on each benchmark by that of the highest-performing machine for that application. This paper reports the measured versatility for several existing processors, as well as for some new and emerging research processors. The benchmark suite is freely distributed, and we are actively cataloging and sharing results for various reference processors.
2004年6月14日 00:00:00 GMT
https://hdl.handle.net/1721.1/30477
Scalar Operand Networks: Design, Implementation, and Analysis
Taylor, Michael Bedford; Lee, Walter; Amarasinghe, Saman; Agarwal, Anant
The bypass paths and multiported register files in microprocessors serve as an implicit interconnect to communicate operand values among pipeline stages and multiple ALUs. Previous superscalar designs implemented this interconnect using centralized structures that do not scale with increasing ILP demands. In search of scalability, recent microprocessor designs in industry and academia exhibit a trend toward distributed resources such as partitioned register files, banked caches, multiple independent compute pipelines, and even multiple program counters. Some of these partitioned microprocessor designs have begun to implement bypassing and operand transport using point-to-point interconnects. We call interconnects optimized for scalar data transport, whether centralized or distributed, scalar operand networks. Although these networks share many of the challenges of multiprocessor networks such as scalability and deadlock avoidance, they have many unique requirements, including ultra-low latencies (a few cycles versus tens of cycles) and ultra-fast operation-operand matching. This paper discusses the unique properties of scalar operand networks (SONs), examines alternative ways of implementing them, and introduces the AsTrO taxonomy to distinguish between them. It discusses the design of two alternative networks in the context of the Raw microprocessor, and presents detailed timing, area and energy statistics for a real implementation. The paper also presents a 5-tuple performance model for SONs and analyzes their performance sensitivity to network properties for ILP workloads.
2004年6月08日 00:00:00 GMT
https://hdl.handle.net/1721.1/30476
Deionizer: A Tool for Capturing and Embedding I/O Cells
Taylor, Michael Bedford
In this paper, we introduce the concept of a deionizer. A deionizer is a special type of partial evaluator whose purpose is to create a new version of a program that can run without accessing a partial set of I/O resources. Although a deionizer can be used for application embedding, this short paper addresses the use of deionization for improving benchmark accuracy. The paper briefly discusses the key ideas and then explains the implementation and use of the MIT deionizer. This deionizer was used to produce the results for a recent conference paper that compares the Raw processor to a Pentium III.
2004年6月07日 00:00:00 GMThttps://hdl.handle.net/1721.1/304762004年06月07日T00:00:00ZBioJADE: A Design and Simulation Tool for Synthetic Biological Systems
https://hdl.handle.net/1721.1/30475
BioJADE: A Design and Simulation Tool for Synthetic Biological Systems
Goler, Jonathan A.
The next generations of both biological engineering and computer engineering demand that control be exerted at the molecular level. Creating, characterizing and controlling synthetic biological systems may provide us with the ability to build cells that are capable of a plethora of activities, from computation to synthesizing nanostructures. To develop these systems, we must have a set of tools not only for synthesizing systems, but also designing and simulating them. The BioJADE project provides a comprehensive, extensible design and simulation platform for synthetic biology. BioJADE is a graphical design tool built in Java, utilizing a database back end, and supports a range of simulations using an XML communication protocol. BioJADE currently supports a library of over 100 parts with which it can compile designs into actual DNA, and then generate synthesis instructions to build the physical parts. The BioJADE project contributes several tools to Synthetic Biology. BioJADE in itself is a powerful tool for synthetic biology designers. Additionally, we developed and now make use of a centralized BioBricks repository, which enables the sharing of BioBrick components between researchers, and vastly reduces the barriers to entry for aspiring Synthetic Biologists.
2004年5月28日 00:00:00 GMThttps://hdl.handle.net/1721.1/304752004年05月28日T00:00:00ZData Structure Repair Using Goal-Directed Reasoning
https://hdl.handle.net/1721.1/30474
Data Structure Repair Using Goal-Directed Reasoning
Demsky, Brian; Rinard, Martin
Model-based data structure repair is a promising technique for enabling programs to continue to execute successfully in the face of otherwise fatal data structure corruption errors. Previous research in this field relied on the developer to write a specification to explicitly translate model repairs into concrete data structure repairs, raising the possibility of 1) incorrect translations causing the supposedly repaired concrete data structures to be inconsistent, and 2) repaired models with no corresponding concrete data structure representation. We present a new repair algorithm that uses goal-directed reasoning to automatically translate model repairs into concrete data structure repairs. This new repair algorithm eliminates the possibility of incorrect translations and repaired models with no corresponding representation as concrete data structures. Unlike our old algorithm, our new algorithm can also repair linked data structures such as a list or a tree.
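A toy illustration of the repair idea, not the paper's algorithm: choose a concrete mutation that re-establishes a violated consistency constraint, here "the list is acyclic and null-terminated".

    class Node:
        def __init__(self, value):
            self.value = value
            self.next = None

    def repair_acyclic(head):
        """Enforce acyclicity by cutting the first back edge found --
        a concrete repair action chosen to satisfy the model constraint."""
        seen, node = set(), head
        while node is not None:
            seen.add(id(node))
            if node.next is not None and id(node.next) in seen:
                node.next = None      # the repair: break the cycle
                break
            node = node.next
        return head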
2004年5月18日 00:00:00 GMThttps://hdl.handle.net/1721.1/304742004年05月18日T00:00:00ZLearning Commonsense Categorical Knowledge in a Thread Memory System
https://hdl.handle.net/1721.1/30473
Learning Commonsense Categorical Knowledge in a Thread Memory System
Stamatoiu, Oana L.
If we are to understand how we can build machines capable of broad-purpose learning and reasoning, we must first aim to build systems that can represent, acquire, and reason about the kinds of commonsense knowledge that we humans have about the world. This endeavor suggests steps such as identifying the kinds of knowledge people commonly have about the world, constructing suitable knowledge representations, and exploring the mechanisms that people use to make judgments about the everyday world. In this work, I contribute to these goals by proposing an architecture for a system that can learn commonsense knowledge about the properties and behavior of objects in the world. The architecture described here augments previous machine learning systems in four ways: (1) it relies on a seven-dimensional notion of context, built from information recently given to the system, to learn and reason about objects' properties; (2) it has multiple methods that it can use to reason about objects, so that when one method fails, it can fall back on others; (3) it illustrates the usefulness of reasoning about objects by thinking about their similarity to other, better-known objects, and by inferring properties of objects from the categories that they belong to; and (4) it represents an attempt to build an autonomous learner and reasoner that sets its own goals for learning about the world and deduces new facts by reflecting on its acquired knowledge. This thesis describes this architecture, as well as a first implementation, which can learn from sentences such as ``A blue bird flew to the tree'' and ``The small bird flew to the cage'' that birds can fly. One of the main contributions of this work lies in suggesting a further set of salient ideas about how we can build broader-purpose commonsense artificial learners and reasoners.
2004年5月18日 00:00:00 GMThttps://hdl.handle.net/1721.1/304732004年05月18日T00:00:00ZGenerative Temporal Planning with Complex Processes
https://hdl.handle.net/1721.1/30472
Generative Temporal Planning with Complex Processes
Kennell, Jonathan
Autonomous vehicles are increasingly being used in mission-critical applications, and robust methods are needed for controlling these inherently unreliable and complex systems. This thesis advocates the use of model-based programming, which allows mission designers to program autonomous missions at the level of a coach or wing commander. To support such a system, this thesis presents the Spock generative planner. To generate plans, Spock must be able to piece together vehicle commands and team tactics that have a complex behavior represented by concurrent processes. This is in contrast to traditional planners, whose operators represent simple atomic or durative actions. Spock represents operators using the RMPL language, which describes behaviors using parallel and sequential compositions of state and activity episodes. RMPL is useful for controlling mobile autonomous missions because it allows mission designers to quickly encode expressive activity models using object-oriented design methods and an intuitive set of activity combinators. Spock is also significant in that it uniformly represents operators and plan-space processes in terms of Temporal Plan Networks, which support temporal flexibility for robust plan execution. Finally, Spock is implemented as a forward progression optimal planner that walks monotonically forward through plan processes, closing any open conditions and resolving any conflicts. This thesis describes the Spock algorithm in detail, along with example problems and test results.
2004年5月18日 00:00:00 GMThttps://hdl.handle.net/1721.1/304722004年05月18日T00:00:00ZVerifying the Correctness of Wide-Area Internet Routing
https://hdl.handle.net/1721.1/30471
Verifying the Correctness of Wide-Area Internet Routing
Feamster, Nick; Balakrishnan, Hari
Several studies have shown that wide-area Internet routing is fragile, with failures occurring for a variety of reasons. Routing fragility is largely due to the flexible and powerful ways in which BGP can be configured to perform various tasks, which range from implementing the policies of commercial relationships to configuring backup paths. Configuring routers in an AS is like writing a distributed program, and BGP's flexible configuration and today's relatively low-level configuration languages make the process error-prone. The primary method used by operators to determine whether their complex configurations are correct is to try them out in operation. We believe that there is a need for a systematic approach to verifying router configurations before they are deployed. This paper develops a static analysis framework for configuration checking, and uses it in the design of rcc, a ``router configuration checker''. rcc takes as input a set of router configurations and flags anomalies and errors, based on a set of well-defined correctness conditions. We have used rcc to check BGP configurations from 9 operational networks, testing nearly 700 real-world router configurations in the process. Every network we analyzed had configuration errors, some of which were potentially serious and had previously gone unnoticed. Our analysis framework and results also suggest ways in which BGP and configuration languages should be improved. rcc has also been downloaded by 30 network operators to date.
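To give the flavor of such well-defined correctness conditions, here is a minimal sketch of one static check; the rule and the input format are our illustration, not rcc's actual design. It flags route maps that a configuration references but never defines.

    def check_dangling_route_maps(configs):
        """configs: list of (router, config_text) pairs."""
        errors = []
        for router, text in configs:
            defined, referenced = set(), []
            for line in text.splitlines():
                words = line.strip().split()
                if not words:
                    continue
                if words[0] == "route-map":        # definition
                    defined.add(words[1])
                elif "route-map" in words:         # reference, e.g. on a neighbor
                    referenced.append(words[words.index("route-map") + 1])
            errors += [(router, f"undefined route-map {name}")
                       for name in referenced if name not in defined]
        return errors

    cfg = [("r1", "neighbor 192.0.2.1 route-map IMPORT in\nroute-map EXPORT permit 10")]
    print(check_dangling_route_maps(cfg))  # [('r1', 'undefined route-map IMPORT')]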
2004年5月17日 00:00:00 GMThttps://hdl.handle.net/1721.1/304712004年05月17日T00:00:00ZA Combined Pointer and Purity Analysis for Java Programs
https://hdl.handle.net/1721.1/30470
A Combined Pointer and Purity Analysis for Java Programs
Salcianu, Alexandru; Rinard, Martin
We present a new method-purity analysis for Java programs. A method is pure if it does not mutate any location that exists in the program state right before method invocation. Our analysis is built on top of a combined pointer and escape analysis for Java programs and is capable of determining that methods are pure even when the methods do heap mutation, provided that the mutation affects only objects created after the beginning of the method. Because our analysis extracts a precise representation of the region of the heap that each method may access, it is able to provide useful information even for methods with externally visible side effects. In particular, it can recognize read-only parameters (a parameter is read-only if the method does not mutate any objects transitively reachable from the parameter) and safe parameters (a parameter is safe if it is read-only and the method does not create any new externally visible paths in the heap to objects transitively reachable from the parameter). The analysis can also generate regular expressions that characterize the externally visible heap locations that the method mutates. We have implemented our analysis and used it to analyze several data structure implementations. Our results show that our analysis effectively recognizes a variety of pure methods, including pure methods that allocate and mutate complex auxiliary data structures. Even if the methods are not pure, our analysis can provide information which may enable developers to usefully bound the potential side effects of the method.
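The purity definition is easy to illustrate. In this hypothetical sketch (Python standing in for Java), the first function mutates only state it allocates after the call begins, so it is pure under the definition above even though it performs heap mutation; the second mutates an object that existed in the prestate and is not.

    def histogram(xs):
        """Pure: 'counts' is created after the method starts, so mutating
        it touches no location that existed before the invocation."""
        counts = {}
        for x in xs:
            counts[x] = counts.get(x, 0) + 1
        return counts

    def dedupe_in_place(xs):
        """Impure: 'xs' existed in the prestate and is mutated here."""
        seen, i = set(), 0
        while i < len(xs):
            if xs[i] in seen:
                del xs[i]
            else:
                seen.add(xs[i])
                i += 1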
2004年5月17日 00:00:00 GMThttps://hdl.handle.net/1721.1/304702004年05月17日T00:00:00ZVideo Matching
https://hdl.handle.net/1721.1/30469
Video Matching
Sand, Peter; Teller, Seth
This paper describes a method for bringing two videos (recorded at different times) into spatiotemporal alignment, then comparing and combining corresponding pixels for applications such as background subtraction, compositing, and increasing dynamic range. We align a pair of videos by searching for frames that best match according to a robust image registration process. This process uses locally weighted regression to interpolate and extrapolate high-likelihood image correspondences, allowing new correspondences to be discovered and refined. Image regions that cannot be matched are detected and ignored, providing robustness to changes in scene content and lighting, which allows a variety of new applications.
2004年5月11日 00:00:00 GMThttps://hdl.handle.net/1721.1/304692004年05月11日T00:00:00ZOn Verifying a File System Implementation
https://hdl.handle.net/1721.1/30468
On Verifying a File System Implementation
Arkoudas, Konstantine; Zee, Karen; Kuncak, Viktor; Rinard, Martin
We present a correctness proof for a basic file system implementation. This implementation contains key elements of standard Unix file systems such as inodes and fixed-size disk blocks. We prove the implementation correct by establishing a simulation relation between the specification of the file system (which models the file system as an abstract map from file names to sequences of bytes) and its implementation (which uses fixed-size disk blocks to store the contents of the files).We used the Athena proof checker to represent and validate our proof. Our experience indicates that Athena's use of block-structured natural deduction, support for structural induction and proof abstraction, and seamless connection with high-performance automated theorem provers were essential to our ability to successfully manage a proof of this size.
2004年5月06日 00:00:00 GMThttps://hdl.handle.net/1721.1/304682004年05月06日T00:00:00ZCan Basic ML Techniques Illuminate Rateless Erasure Codes?
https://hdl.handle.net/1721.1/30467
Can Basic ML Techniques Illuminate Rateless Erasure Codes?
Gupta, Anjali; Krohn, Maxwell; Walfish, Michael
The recently developed rateless erasure codes are a near-optimal channel coding technique that guarantees low overhead and fast decoding. The underlying theory, and current implementations, of these codes assume that a network transmitter encodes according to a pre-specified probability distribution. In this report, we use basic Machine Learning techniques to try to understand what happens when this assumption is false. We train several classes of models using certain features that describe the empirical distribution realized at a network receiver, and we investigate whether these models can ``learn'' to predict whether a given encoding will require extra overhead. Our results are mixed.
2004年5月05日 00:00:00 GMThttps://hdl.handle.net/1721.1/304672004年05月05日T00:00:00ZA Unified Statistical and Information Theoretic Framework for Multi-modal Image Registration
https://hdl.handle.net/1721.1/30466
A Unified Statistical and Information Theoretic Framework for Multi-modal Image Registration
Zollei, Lilla; Fisher, John; Wells, William
We formulate and interpret several multi-modal registration methods in the context of a unified statistical and information theoretic framework. A unified interpretation clarifies the implicit assumptions of each method, yielding a better understanding of their relative strengths and weaknesses. Additionally, we discuss a generative statistical model from which we derive a novel analysis tool, the "auto-information function", as a means of assessing and exploiting the common spatial dependencies inherent in multi-modal imagery. We analytically derive useful properties of the "auto-information function" as well as verify them empirically on multi-modal imagery. Among the useful aspects of the "auto-information function" is that it can be computed from imaging modalities independently and it allows one to decompose the search space of registration problems.
2004年4月28日 00:00:00 GMThttps://hdl.handle.net/1721.1/304662004年04月28日T00:00:00ZRotation Invariant Object Recognition from One Training Example
https://hdl.handle.net/1721.1/30465
Rotation Invariant Object Recognition from One Training Example
Yokono, Jerry Jun; Poggio, Tomaso
Local descriptors are increasingly used for the task of object recognition because of their perceived robustness with respect to occlusions and to global geometrical deformations. Such a descriptor -- based on a set of oriented Gaussian derivative filters -- is used in our recognition system. We report here an evaluation of several techniques for orientation estimation to achieve rotation invariance of the descriptor. We also describe feature selection based on a single training image. Virtual images are generated by rotating and rescaling the image, and robust features are selected. The results confirm robust performance in cluttered scenes, in the presence of partial occlusions, and when the object is embedded in different backgrounds.
2004年4月27日 00:00:00 GMThttps://hdl.handle.net/1721.1/304652004年04月27日T00:00:00ZLight-Weight Leases for Storage-Centric Coordination
https://hdl.handle.net/1721.1/30464
Light-Weight Leases for Storage-Centric Coordination
Chockler, Gregory; Malkhi, Dahlia
We propose light-weight lease primitives to leverage fault-tolerant coordination among clients accessing a shared storage infrastructure (such as network attached disks or storage servers). In our approach, leases are implemented from the very shared data that they protect. That is, there is no global lease manager; instead, there is a lease per data item (e.g., a file, a directory, a disk partition, etc.) or a collection thereof. Our lease primitives are useful for facilitating exclusive access to data in systems satisfying certain timeliness constraints. In addition, they can be utilized as a building block for implementing dependable services resilient to timing failures. In particular, we show a simple lease-based solution for fault-tolerant Consensus, which is a benchmark distributed coordination problem.
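A minimal sketch of a lease record stored with the very data it protects, assuming the loosely synchronized clocks implied by the timeliness constraints above; this is our illustration, not the paper's construction.

    import time

    class LeasedItem:
        """A data item whose lease lives alongside the data itself."""
        def __init__(self, value):
            self.value = value
            self.holder = None     # current lease holder, if any
            self.expires = 0.0     # lease expiration time

        def try_acquire(self, client, duration=2.0):
            now = time.time()
            if self.holder is None or now >= self.expires:
                self.holder, self.expires = client, now + duration
                return True
            return False           # an unexpired lease is held by someone else

        def write(self, client, value):
            if self.holder == client and time.time() < self.expires:
                self.value = value  # exclusive access while the lease is valid
            else:
                raise PermissionError("lease not held")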
2004年4月22日 00:00:00 GMThttps://hdl.handle.net/1721.1/304642004年04月22日T00:00:00ZCascading Regularized Classifiers
https://hdl.handle.net/1721.1/30463
Cascading Regularized Classifiers
Perez-Breva, Luis
Among the various methods to combine classifiers, Boosting was originally conceived as a stratagem to cascade pairs of classifiers through their disagreement. I recover the same idea from the work of Niyogi et al. to show how to loosen the requirement of weak learnability, central to Boosting, and introduce a new cascading stratagem. The paper concludes with an empirical study of an implementation of the cascade that, under assumptions that mirror the conditions imposed by Viola and Jones in [VJ01], preserves the generalization ability of boosting.
2004年4月21日 00:00:00 GMThttps://hdl.handle.net/1721.1/304632004年04月21日T00:00:00ZM&M: A Passive Toolkit for Measuring, Correlating, and Tracking Path Characteristics
https://hdl.handle.net/1721.1/30462
M&M: A Passive Toolkit for Measuring, Correlating, and Tracking Path Characteristics
Katti, Sachin; Katabi, Dina; Kohler, Eddie; Strauss, Jacob
This paper presents M&M, a passive measurement toolkit suitable for large-scale studies of Internet path characteristics. The multiQ tool uses equally-spaced mode gaps in TCP flows' packet interarrival time distributions to detect multiple bottleneck capacities and their relative order. Unlike previous tools, multiQ can discover up to three bottlenecks from the tcpdump trace of a single flow, and can work with acknowledgment as well as data interarrivals. We also describe the mystery tool, a simple TCP loss event, packet loss, and RTT analyzer designed to work in concert with multiQ. The M&M toolkit can measure simple path properties; correlate different types of measurement of the same path, producing new kinds of results; and, because M&M is passive, it can use publicly-available traces to track the value of a measurement over multiple years. We validate our tools in depth using the RON overlay network [4], which provides more than 400 heterogeneous Internet paths and detailed information about their characteristics. We compare multiQ with Nettimer and Pathrate, two other capacity measurement tools, in the first wide-area, real-world validation of capacity measurement techniques. Each tool accurately discovers minimum capacities (85% of measurements are within 10% of the true value); multiQ additionally discovers multiple bottlenecks and their orderings. We also use our toolkit to perform several measurement studies using a reservoir of 375 million traced packets spanning the last two years. Among the results of these studies are that bottleneck capacity on our traced links has gone up by around an order of magnitude from 2002 to 2004, and that differences in levels of statistical multiplexing on 10 Mb/s and 100 Mb/s bottleneck links result in flows over those links having similar fair-share bandwidths.
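The mode-gap observation reduces to packet-pair reasoning: a bottleneck of capacity C spaces back-to-back packets of size S by S/C seconds, so an interarrival mode gap of g implies C = S/g. A minimal sketch (our simplification of what multiQ extracts from a full interarrival distribution):

    def capacity_from_mode_gap(gap_seconds, packet_bytes=1500):
        """Convert an interarrival mode gap into a capacity estimate."""
        return packet_bytes * 8 / gap_seconds      # bits per second

    # Example: 1500-byte packets with a 1.2 ms mode gap -> ~10 Mb/s bottleneck.
    print(capacity_from_mode_gap(0.0012))          # 10000000.0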
2004年4月14日 00:00:00 GMThttps://hdl.handle.net/1721.1/304622004年04月14日T00:00:00ZA 1020-Node Modular Microphone Array and Beamformer for Intelligent Computing Spaces
https://hdl.handle.net/1721.1/30461
A 1020-Node Modular Microphone Array and Beamformer for Intelligent Computing Spaces
Weinstein, Eugene; Steele, Kenneth; Agarwal, Anant; Glass, James
Ubiquitous computing environments are characterized by an unbounded amount of noise and crosstalk. In these environments, traditional methods of sound capture are insufficient, and array microphones are needed in order to obtain a clean recording of desired speech. In this work, we have designed, implemented, and tested LOUD, a novel 1020-node microphone array utilizing the Raw tile parallel processor architecture for computation. To the best of our knowledge, this is currently the largest microphone array in the world. We have explored the uses of the array within ubiquitous computing scenarios by implementing an acoustic beamforming algorithm for sound source amplification in a noisy environment, and have obtained preliminary results demonstrating the efficacy of the array. From one to 1020 microphones, we have shown a 13.7dB increase in peak SNR on a representative utterance, an 87.2% drop in word error rate with an interferer present, and an 89.6% drop in WER without an interferer.
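The abstract names acoustic beamforming without specifying the algorithm; as a hedged sketch, here is delay-and-sum, the simplest beamformer, which aligns each channel by its propagation delay so sound from the chosen source adds coherently while off-axis noise averages out.

    import numpy as np

    def delay_and_sum(signals, mic_pos, source_pos, fs, c=343.0):
        """signals: (n_mics, n_samples); mic_pos: (n_mics, 3); positions in
        meters; fs: sampling rate in Hz; c: speed of sound in m/s."""
        dists = np.linalg.norm(mic_pos - source_pos, axis=1)
        delays = (dists - dists.min()) / c           # relative delays, seconds
        shifts = np.round(delays * fs).astype(int)   # delays in samples
        n = signals.shape[1] - shifts.max()
        aligned = np.stack([s[k:k + n] for s, k in zip(signals, shifts)])
        return aligned.mean(axis=0)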
2004年4月14日 00:00:00 GMThttps://hdl.handle.net/1721.1/304612004年04月14日T00:00:00ZContextual Influences on Saliency
https://hdl.handle.net/1721.1/30460
Contextual Influences on Saliency
Torralba, Antonio
This article describes a model for including scene/context priors in attention guidance. In the proposed scheme, visual context information can be available early in the visual processing chain, in order to modulate the saliency of image regions and to provide an efficient shortcut for object detection and recognition. The scene is represented by means of a low-dimensional global description obtained from low-level features. The global scene features are then used to predict the probability of presence of the target object in the scene, and its location and scale, before exploring the image.
2004年4月14日 00:00:00 GMThttps://hdl.handle.net/1721.1/304602004年04月14日T00:00:00ZA Quantitative Comparison of Reconfigurable, Tiled, and Conventional Architectures on Bit-level Computation
https://hdl.handle.net/1721.1/30459
A Quantitative Comparison of Reconfigurable, Tiled, and Conventional Architectures on Bit-level Computation
Wentzlaff, David; Agarwal, Anant
General purpose computing architectures are being called on to work on a more diverse application mix every day. This has been fueled by the need for reduced time to market and economies of scale that are the hallmarks of software on general purpose microprocessors. As this application mix expands, application domains such as bit-level computation, which has primarily been the domain of ASICs and FPGAs, will need to be effectively handled by general purpose hardware. Examples of bit-level applications include Ethernet framing, forward error correction encoding/decoding, and efficient state machine implementation. In this paper we compare how differing computational structures such as ASICs, FPGAs, tiled architectures, and superscalar microprocessors are able to compete on bit-level communication applications. A quantitative comparison in terms of absolute performance and performance per area will be presented. These results show that although modest gains (2-3x) in absolute performance can be achieved when using FPGAs versus tuned microprocessor implementations, it is the significantly larger gains (2-3 orders of magnitude) that can be achieved in performance per area that will motivate work on supporting bit-level computation in a general purpose fashion in the future.
2004年4月13日 00:00:00 GMThttps://hdl.handle.net/1721.1/304592004年04月13日T00:00:00ZLong-Lived Rambo: Trading Knowledge for Communication
https://hdl.handle.net/1721.1/30458
Long-Lived Rambo: Trading Knowledge for Communication
Georgiou, Chryssis; Musial, Peter M.; Shvartsman, Alexander A.
Shareable data services providing consistency guarantees, such as atomicity (linearizability), make building distributed systems easier. However, combining linearizability with efficiency in practical algorithms is difficult. A reconfigurable linearizable data service, called Rambo, was developed by Lynch and Shvartsman. This service guarantees consistency under dynamic conditions involving asynchrony, message loss, node crashes, and new node arrivals. The specification of the original algorithm is given at an abstract level aimed at concise presentation and formal reasoning about correctness. The algorithm propagates information by means of gossip messages. If the service is in use for a long time, the size and the number of gossip messages may grow without bound. This paper presents a consistent data service for long-lived objects that improves on Rambo in two ways: it includes an incremental communication protocol and a leave service. The new protocol takes advantage of local knowledge, and carefully manages the size of messages by removing redundant information, while the leave service allows nodes to leave the system gracefully. The new algorithm is formally proved correct by forward simulation using levels of abstraction. An experimental implementation of the system was developed for networks-of-workstations. The paper also includes selected analytical and preliminary empirical results that illustrate the advantages of the new algorithm.
2004年4月12日 00:00:00 GMThttps://hdl.handle.net/1721.1/304582004年04月12日T00:00:00ZOn Generalized Records and Spatial Conjunction in Role Logic
https://hdl.handle.net/1721.1/30457
On Generalized Records and Spatial Conjunction in Role Logic
Kuncak, Viktor; Rinard, Martin
We have previously introduced role logic as a notation for describing properties of relational structures in shape analysis, databases, and knowledge bases. A natural fragment of role logic corresponds to two-variable logic with counting and is therefore decidable. We show how to use role logic to describe open and closed records, as well as the dual of records, inverse records. We observe that the spatial conjunction operation of separation logic naturally models record concatenation. Moreover, we show how to eliminate the spatial conjunction of formulas of quantifier depth one in first-order logic with counting. As a result, allowing spatial conjunction of formulas of quantifier depth one preserves the decidability of two-variable logic with counting. This result applies to the two-variable role logic fragment as well. The resulting logic smoothly integrates type system and predicate calculus notation and can be viewed as a natural generalization of the notation for constraints arising in role analysis and similar shape analysis approaches.
2004年4月06日 00:00:00 GMThttps://hdl.handle.net/1721.1/304572004年04月06日T00:00:00ZConverting Java Programs to Use Generic Libraries
https://hdl.handle.net/1721.1/30456
Converting Java Programs to Use Generic Libraries
Donovan, Alan; Kiezun, Adam; Tschantz, Matthew S.; Ernst, Michael D.
Java 1.5 will include a type system (called JSR-14) that supports parametric polymorphism, or generic classes. This will bring many benefits to Java programmers, not least because current Java practice makes heavy use of logically-generic classes, including container classes.Translation of Java source code into semantically equivalent JSR-14 source code requires two steps: parameterization (adding type parameters to class definitions) and instantiation (adding the type arguments at each use of a parameterized class). Parameterization need be done only once for a class, whereas instantiation must be performed for each client, of which there are potentially many more. Therefore, this work focuses on the instantiation problem. We present a technique to determine sound and precise JSR-14 types at each use of a class for which a generic type specification is available. Our approach uses a precise and context-sensitive pointer analysis to determine possible types at allocation sites, and a set-constraint-based analysis (that incorporates guarded, or conditional, constraints) to choose consistent types for both allocation and declaration sites. The technique handles all features of the JSR-14 type system, notably the raw types that provide backward compatibility. We have implemented our analysis in a tool that automatically inserts type parameters into Java code, and we report its performance when applied to a number of real-world Java programs.
2004年3月30日 00:00:00 GMThttps://hdl.handle.net/1721.1/304562004年03月30日T00:00:00ZPredicting Problems Caused by Component Upgrades
https://hdl.handle.net/1721.1/30455
Predicting Problems Caused by Component Upgrades
McCamant, Stephen; Ernst, Michael D.
This report presents a new, automatic technique to assess whether replacing a component of a software system by a purportedly compatible component may change the behavior of the system. The technique operates before integrating the new component into the system or running system tests, permitting quicker and cheaper identification of problems. It takes into account the system's use of the component, because a particular component upgrade may be desirable in one context but undesirable in another. No formal specifications are required, permitting detection of problems due either to errors in the component or to errors in the system. Both external and internal behaviors can be compared, enabling detection of problems that are not immediately reflected in the output. The technique generates an operational abstraction for the old component in the context of the system, and one for the new component in the context of its test suite. An operational abstraction is a set of program properties that generalizes over observed run-time behavior. Modeling a system as divided into modules, and taking into account the control and data flow between the modules, we formulate a logical condition to guarantee that the system's behavior is preserved across a component replacement. If automated logical comparison indicates that the new component does not make all the guarantees that the old one did, then the upgrade may affect system behavior and should not be performed without further scrutiny. We describe a practical implementation of the technique, incorporating enhancements to handle nonlocal state, non-determinism, and missing test suites, and to distinguish old from new incompatibilities. We evaluate the implementation in case studies using real-world systems, including the Linux C library and 48 Unix programs. Our implementation identified real incompatibilities among versions of the C library that affected some of the programs, and it approved the upgrades for other programs that were unaffected by the changes. This report is a revision of the first author's Master's thesis, submitted January 2004.
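Treating an operational abstraction as a set of named properties, and the automated logical comparison as set containment, is a deliberate simplification of the report's approach; with that caveat, the upgrade condition can be sketched as follows.

    def upgrade_is_safe(old_props, new_props):
        """Approve the upgrade only if the new component still guarantees
        every property the system observed of the old component."""
        missing = old_props - new_props
        if missing:
            print("upgrade may change behavior; lost guarantees:", missing)
            return False
        return True

    old = {"return value >= 0", "argument 0 not mutated"}
    new = {"return value >= 0"}
    upgrade_is_safe(old, new)   # flags the lost mutation guarantee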
2004年3月30日 00:00:00 GMThttps://hdl.handle.net/1721.1/304552004年03月30日T00:00:00ZEvaluation of sets of oriented and non-oriented receptive fields as local descriptors
https://hdl.handle.net/1721.1/30454
Evaluation of sets of oriented and non-oriented receptive fields as local descriptors
Yokono, Jerry Jun; Poggio, Tomaso
Local descriptors are increasingly used for the task of object recognition because of their perceived robustness with respect to occlusions and to global geometrical deformations. We propose a performance criterion for a local descriptor based on the tradeoff between selectivity and invariance. In this paper, we evaluate several local descriptors with respect to selectivity and invariance. The descriptors that we evaluated are Gaussian derivatives up to the third order, gray image patches, and Laplacian-based descriptors with either three scales or one scale filters. We compare selectivity and invariance to several affine changes such as rotation, scale, brightness, and viewpoint. Comparisons have been made keeping the dimensionality of the descriptors roughly constant. The overall results indicate a good performance by the descriptor based on a set of oriented Gaussian filters. It is interesting that oriented receptive fields similar to the Gaussian derivatives as well as receptive fields similar to the Laplacian are found in primate visual cortex.
2004年3月24日 00:00:00 GMThttps://hdl.handle.net/1721.1/304542004年03月24日T00:00:00ZPredicting Unroll Factors Using Nearest Neighbors
https://hdl.handle.net/1721.1/30453
Predicting Unroll Factors Using Nearest Neighbors
Stephenson, Mark; Amarasinghe, Saman
In order to deliver the promise of Moore's Law to the end user, compilers must make decisions that are intimately tied to a specific target architecture. As engineers add architectural features to increase performance, systems become harder to model, and thus, it becomes harder for a compiler to make effective decisions. Machine-learning techniques may be able to help compiler writers model modern architectures. Because learning techniques can effectively make sense of high dimensional spaces, they can be a valuable tool for clarifying and discerning complex decision boundaries. In our work we focus on loop unrolling, a well-known optimization for exposing instruction level parallelism. Using the Open Research Compiler as a testbed, we demonstrate how one can use supervised learning techniques to model the appropriateness of loop unrolling. We use more than 1,100 loops -- drawn from 46 benchmarks -- to train a simple learning algorithm to recognize when loop unrolling is advantageous. The resulting classifier can predict with 88% accuracy whether a novel loop (i.e., one that was not in the training set) benefits from loop unrolling. Furthermore, we can predict the optimal or nearly optimal unroll factor 74% of the time. We evaluate the ramifications of these prediction accuracies using the Open Research Compiler (ORC) and the Itanium 2 architecture. The learned classifier yields a 6% speedup (over ORC's unrolling heuristic) for SPEC benchmarks, and a 7% speedup on the remainder of our benchmarks. Because the learning techniques we employ run very quickly, we were able to exhaustively determine the four most salient loop characteristics for determining when unrolling is beneficial.
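A minimal sketch of the supervised setup using a nearest-neighbor predictor; the loop features and training values below are hypothetical, and the paper's actual features and learner may differ.

    import numpy as np

    def predict_unroll(features, train_X, train_y, k=3):
        """Vote among the k training loops closest in feature space."""
        d = np.linalg.norm(train_X - features, axis=1)
        nearest = train_y[np.argsort(d)[:k]]
        vals, counts = np.unique(nearest, return_counts=True)
        return vals[np.argmax(counts)]

    # Hypothetical features: (trip count, body size in ops, memory ops).
    X = np.array([[100, 8, 2], [8, 40, 12], [1000, 4, 1]])
    y = np.array([4, 1, 8])     # best-known unroll factors for those loops
    print(predict_unroll(np.array([120, 6, 2]), X, y, k=1))  # -> 4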
2004年3月22日 00:00:00 GMThttps://hdl.handle.net/1721.1/304532004年03月22日T00:00:00ZREED: Robust, Efficient Filtering and Event Detection in Sensor Networks
https://hdl.handle.net/1721.1/30452
REED: Robust, Efficient Filtering and Event Detection in Sensor Networks
Abadi, Daniel J.; Madden, Samuel R.
This paper presents an algorithm for handling many types of filters in sensor networks that cannot be expressed using a simple predicate. Specifically, the action of the filter may be predicated on sensor-produced data where an entire table of sensor-data/result-value pairs is needed to resolve the filter. We describe and evaluate three algorithms that can perform these filters by taking advantage of distributed database join techniques. Our join-based algorithms are capable of running in very limited amounts of RAM, can distribute the storage burden over groups of nodes, and are tolerant to dropped packets and node failures. REED is thus suitable for a wide range of event-detection applications that traditional sensor network database and data collection systems cannot be used to implement.
2004年3月22日 00:00:00 GMThttps://hdl.handle.net/1721.1/304522004年03月22日T00:00:00ZFace processing in humans is compatible with a simple shape-based model of vision
https://hdl.handle.net/1721.1/30451
Face processing in humans is compatible with a simple shape-based model of vision
Riesenhuber; Jarudi; Gilad; Sinha
Understanding how the human visual system recognizes objects is one of the key challenges in neuroscience. Inspired by a large body of physiological evidence (Felleman and Van Essen, 1991; Hubel and Wiesel, 1962; Livingstone and Hubel, 1988; Tso et al., 2001; Zeki, 1993), a general class of recognition models has emerged which is based on a hierarchical organization of visual processing, with succeeding stages being sensitive to image features of increasing complexity (Hummel and Biederman, 1992; Riesenhuber and Poggio, 1999; Selfridge, 1959). However, these models appear to be incompatible with some well-known psychophysical results. Prominent among these are experiments investigating recognition impairments caused by vertical inversion of images, especially those of faces. It has been reported that faces that differ ``featurally'' are much easier to distinguish when inverted than those that differ ``configurally'' (Freire et al., 2000; Le Grand et al., 2001; Mondloch et al., 2002), a finding that is difficult to reconcile with the aforementioned models. Here we show that after controlling for subjects' expectations, there is no difference between ``featurally'' and ``configurally'' transformed faces in terms of inversion effect. This result reinforces the plausibility of simple hierarchical models of object representation and recognition in cortex.
2004年3月05日 00:00:00 GMThttps://hdl.handle.net/1721.1/304512004年03月05日T00:00:00ZVirtual Mobile Nodes for Mobile Ad Hoc Networks
https://hdl.handle.net/1721.1/30450
Virtual Mobile Nodes for Mobile Ad Hoc Networks
Dolev, Shlomi; Gilbert, Seth; Lynch, Nancy A.; Schiller, Elad; Shvarstman, Alex A.; Welch, Jennifer
One of the most significant challenges introduced by mobile networks is the difficulty in coping with the unpredictable movement of mobile nodes. If, instead, the mobile nodes could be programmed to travel through the world in a predictable and useful manner, the task of designing algorithms for mobile networks would be significantly simplified. Alas, users of mobile devices in the real world are not amenable to following instructions as to where their devices may travel. While real mobile nodes may be disinclined to move as desired, we propose executing algorithms on virtual mobile nodes that move in a predetermined, predictable manner through the real world. In this paper, we define the Virtual Mobile Node Abstraction, and present selected algorithms that take advantage of virtual mobile nodes to simply and efficiently perform complicated tasks in highly dynamic, unpredictable mobile ad hoc networks. We then present the Mobile Point Emulator, a new algorithm that implements robust virtual mobile nodes. This algorithm replicates the virtual node at a constantly changing set of real nodes, choosing new replicas as the real nodes move in and out of the path of the virtual node. We claim that the Mobile Point algorithm correctly implements a virtual mobile node, and that it is robust as long as the virtual node travels through well-populated areas of the network.
2004年2月26日 00:00:00 GMThttps://hdl.handle.net/1721.1/304502004年02月26日T00:00:00ZGeoQuorums: Implementing Atomic Memory in Mobile Ad Hoc Networks
https://hdl.handle.net/1721.1/30449
GeoQuorums: Implementing Atomic Memory in Mobile Ad Hoc Networks
Dolev, Shlomi; Gilbert, Seth; Lynch, Nancy A.; Shvartsman, Alex A.; Welch, Jennifer L.
We present a new approach, the GeoQuorums approach, for implementing atomic read/write shared memory in mobile ad hoc networks. Our approach is based on associating abstract atomic objects with certain geographic locations. We assume the existence of focal points, geographic areas that are normally ``populated'' by mobile nodes. For example, a focal point may be a road junction, a scenic observation point, or a water resource in the desert. Mobile nodes that happen to populate a focal point participate in implementing a shared atomic object, using a replicated state machine approach. These objects, which we call focal point objects, are then used to implement atomic read/write operations on a virtual shared object, using our new GeoQuorums algorithm. The GeoQuorums algorithm uses a quorum-based strategy in which each quorum consists of a set of focal point objects. The quorums are used to maintain the consistency of the shared memory and to tolerate limited failures of the focal point objects, caused by depopulation of the corresponding geographic areas. We present a mechanism for changing the set of quorums on the fly, thus improving efficiency. Overall, the new GeoQuorums algorithm efficiently implements read and write operations in a highly dynamic, mobile network.
2004年2月25日 00:00:00 GMThttps://hdl.handle.net/1721.1/304492004年02月25日T00:00:00ZMultiChord: A Resilient Namespace Management Protocol
https://hdl.handle.net/1721.1/30448
MultiChord: A Resilient Namespace Management Protocol
Lynch, Nancy; Stoica, Ion
MultiChord is a new variant of the Chord namespace management algorithm [7] that includes lightweight mechanisms for accommodating a limited rate of change, specifically, process joins and failures. This paper describes the algorithm formally and evaluates its performance, using both simulation and analysis. Our main result is that lookups are provably correct -- that is, each lookup returns results that are consistent with a hypothetical ideal system that differs from the actual system only in entries corresponding to recent joins and failures -- in the presence of a limited rate of change. In particular, if the number of joins and failures that occur during a given time interval in a given region of the system are bounded, then all lookups are correct. A second result is a guaranteed upper bound for the latency of a lookup operation in the absence of any other lookups in the system. Finally, we establish a relationship between the deterministic assumptions of bounded joins and failures and the probabilistic assumptions (which are often used to model large scale networks). In particular, we derive a lower bound for the mean time between two violations of the deterministic assumptions in a steady state system where joins and failures are modeled by Poisson processes.
2004年2月19日 00:00:00 GMThttps://hdl.handle.net/1721.1/304482004年02月19日T00:00:00ZNew Architectural Models for Visibly Controllable Computing: The Relevance of Dynamic Object Oriented Architectures and Plan Based Computing Models
https://hdl.handle.net/1721.1/30447
New Architectural Models for Visibly Controllable Computing: The Relevance of Dynamic Object Oriented Architectures and Plan Based Computing Models
Shrobe, Howard; Laddaga, Robert
Traditionally, we've focused on the question of how to make a system easy to code the first time, or perhaps on how to ease the system's continued evolution. But if we look at life cycle costs, then we must conclude that the important question is how to make a system easy to operate. To do this we need to make it easy for the operators to see what's going on and to then manipulate the system so that it does what it is supposed to. This is a radically different criterion for success. What makes a computer system visible and controllable? This is a difficult question, but it's clear that today's modern operating systems, with nearly 50 million source lines of code, are neither. Strikingly, the MIT Lisp Machine and its commercial successors provided almost the same functionality as today's mainstream systems, but with only 1 million lines of code. This paper is a retrospective examination of the features of the Lisp Machine hardware and software system. Our key claim is that by building the Object Abstraction into the lowest tiers of the system, great synergy and clarity were obtained. It is our hope that this is a lesson that can impact tomorrow's designs. We also speculate on how the spirit of the Lisp Machine could be extended to include a comprehensive access control model and how new layers of abstraction could further enrich this model.
2004年2月09日 00:00:00 GMThttps://hdl.handle.net/1721.1/304472004年02月09日T00:00:00ZEnhancing Availability and Security Through Failure-Oblivious Computing
https://hdl.handle.net/1721.1/30446
Enhancing Availability and Security Through Failure-Oblivious Computing
Rinard, Martin; Cadar, Cristian; Dumitran, Daniel; Roy, Daniel M.; Jr., William S. Beebee
We present a new technique, failure-oblivious computing, that enables programs to continue to execute through memory errors without memory corruption. Our safe compiler for C inserts checks that dynamically detect invalid memory accesses. Instead of terminating the execution or throwing an exception, the generated code simply discards invalid writes and manufactures values to return for invalid reads, enabling the program to continue its normal execution. We have applied failure-oblivious computing to a set of widely-used programs that are part of the Linux-based open-source interactive computing environment. Our results show that our techniques 1) make these programs invulnerable to known security attacks that exploit memory errors, and 2) enable the programs to continue to operate successfully to service legitimate requests and satisfy the needs of their users even after attacks trigger their memory errors.
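The system itself is a safe compiler for C; purely as a language-level sketch of the runtime policy (discard invalid writes, manufacture values for invalid reads), consider:

    class FailureObliviousBuffer:
        def __init__(self, size):
            self.data = [0] * size
            self.manufactured = 0   # simple value sequence for invalid reads

        def write(self, i, v):
            if 0 <= i < len(self.data):
                self.data[i] = v    # valid write
            # invalid write: silently discarded, no other state is corrupted

        def read(self, i):
            if 0 <= i < len(self.data):
                return self.data[i]
            self.manufactured += 1  # invalid read: manufacture a value
            return self.manufactured

    buf = FailureObliviousBuffer(4)
    buf.write(10, 99)       # out of bounds: discarded rather than corrupting
    print(buf.read(10))     # manufactured value; execution simply continues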
2004年2月06日 00:00:00 GMThttps://hdl.handle.net/1721.1/304462004年02月06日T00:00:00ZVirtual Visual Hulls: Example-Based 3D Shape Estimation from a Single Silhouette
https://hdl.handle.net/1721.1/30445
Virtual Visual Hulls: Example-Based 3D Shape Estimation from a Single Silhouette
Grauman, Kristen; Shakhnarovich, Gregory; Darrell, Trevor
Recovering a volumetric model of a person, car, or other object of interest from a single snapshot would be useful for many computer graphics applications. 3D model estimation in general is hard, and currently requires active sensors, multiple views, or integration over time. For a known object class, however, 3D shape can be successfully inferred from a single snapshot. We present a method for generating a ``virtual visual hull'' -- an estimate of the 3D shape of an object from a known class, given a single silhouette observed from an unknown viewpoint. For a given class, a large database of multi-view silhouette examples from calibrated, though possibly varied, camera rigs is collected. To infer a novel single-view input silhouette's virtual visual hull, we search for 3D shapes in the database which are most consistent with the observed contour. The input is matched to component single views of the multi-view training examples. A set of viewpoint-aligned virtual views is generated from the visual hulls corresponding to these examples. The 3D shape estimate for the input is then found by interpolating between the contours of these aligned views. When the underlying shape is ambiguous given a single view silhouette, we produce multiple visual hull hypotheses; if a sequence of input images is available, a dynamic programming approach is applied to find the maximum likelihood path through the feasible hypotheses over time. We show results of our algorithm on real and synthetic images of people.
2004年1月28日 00:00:00 GMThttps://hdl.handle.net/1721.1/304452004年01月28日T00:00:00ZSelecting Relevant Genes with a Spectral Approach
https://hdl.handle.net/1721.1/30444
Selecting Relevant Genes with a Spectral Approach
Wolf, Lior; Shashua, Amnon; Mukherjee, Sayan
Array technologies have made it possible to record simultaneously the expression pattern of thousands of genes. A fundamental problem in the analysis of gene expression data is the identification of highly relevant genes that either discriminate between phenotypic labels or are important with respect to the cellular process studied in the experiment: for example cell cycle or heat shock in yeast experiments, chemical or genetic perturbations of mammalian cell lines, and genes involved in class discovery for human tumors. In this paper we focus on the task of unsupervised gene selection. The problem of selecting a small subset of genes is particularly challenging as the datasets involved are typically characterized by a very small sample size, on the order of a few tens of tissue samples, and by a very large feature space, as the number of genes tends to be in the high thousands. We propose a model-independent approach which scores candidate gene selections using spectral properties of the candidate affinity matrix. The algorithm is very straightforward to implement, yet contains a number of remarkable properties which guarantee consistent sparse selections. To illustrate the value of our approach we applied our algorithm to five different datasets. The first consists of time course data from four well-studied hematopoietic cell lines (HL-60, Jurkat, NB4, and U937). The other four datasets include three well-studied treatment outcomes (large cell lymphoma, childhood medulloblastomas, breast tumors) and one unpublished dataset (lymph status). We compared our approach both with other unsupervised methods (SOM, PCA, GS) and with supervised methods (SNR, RMB, RFE). The results clearly show that our approach considerably outperforms all the other unsupervised approaches in our study, is competitive with supervised methods, and in some cases even outperforms supervised approaches.
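A hedged sketch of the spectral idea: score a candidate gene subset by how much of the spectral energy of the induced sample-affinity matrix is captured by its leading eigenvalues, so subsets inducing clear cluster structure score highly. The exact scoring and the search over subsets in the paper differ in detail.

    import numpy as np

    def spectral_score(X, genes, k=2):
        """X: (n_genes, n_samples) expression matrix; genes: candidate subset."""
        A = X[genes].T @ X[genes]                  # sample-by-sample affinity
        w = np.sort(np.linalg.eigvalsh(A))[::-1]   # eigenvalues, descending
        return w[:k].sum() / w.sum()               # energy in the top k

    X = np.random.rand(1000, 30)                   # toy: 1000 genes, 30 samples
    print(spectral_score(X, genes=[3, 41, 77, 205]))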
2004年1月27日 00:00:00 GMThttps://hdl.handle.net/1721.1/304442004年01月27日T00:00:00ZRisk Bounds for Mixture Density Estimation
https://hdl.handle.net/1721.1/30443
Risk Bounds for Mixture Density Estimation
Rakhlin, Alexander; Panchenko, Dmitry; Mukherjee, Sayan
In this paper we focus on the problem of estimating a bounded density using a finite combination of densities from a given class. We consider the Maximum Likelihood Procedure (MLE) and the greedy procedure described by Li and Barron. Approximation and estimation bounds are given for the above methods. We extend and improve upon the estimation results of Li and Barron, and in particular prove an $O(\frac{1}{\sqrt{n}})$ bound on the estimation error which does not depend on the number of densities in the estimated combination.
2004年1月27日 00:00:00 GMThttps://hdl.handle.net/1721.1/304432004年01月27日T00:00:00ZOn the difficulty of feature-based attentional modulations in visual object recognition: A modeling study.
https://hdl.handle.net/1721.1/30442
On the difficulty of feature-based attentional modulations in visual object recognition: A modeling study.
Schneider, Robert; Riesenhuber, Maximilian
Numerous psychophysical experiments have shown an important role for attentional modulations in vision. Behaviorally, allocation of attention can improve performance in object detection and recognition tasks. At the neural level, attention increases firing rates of neurons in visual cortex whose preferred stimulus is currently attended to. However, it is not yet known how these two phenomena are linked, i.e., how the visual system could be "tuned" in a task-dependent fashion to improve task performance. To answer this question, we performed simulations with the HMAX model of object recognition in cortex [45]. We modulated firing rates of model neurons in accordance with experimental results about effects of feature-based attention on single neurons and measured changes in the model's performance in a variety of object recognition tasks. It turned out that recognition performance could only be improved under very limited circumstances and that attentional influences on the process of object recognition per se tend to display a lack of specificity or raise false alarm rates. These observations lead us to postulate a new role for the observed attention-related neural response modulations.
2004年1月14日 00:00:00 GMThttps://hdl.handle.net/1721.1/304422004年01月14日T00:00:00ZOn Modular Pluggable Analyses Using Set Interfaces
https://hdl.handle.net/1721.1/30441
On Modular Pluggable Analyses Using Set Interfaces
Lam, Patrick; Kuncak, Viktor; Rinard, Martin
We present a technique that enables the focused application of multiple analyses to different modules in the same program. Our research has two goals: 1) to address the scalability limitations of precise analyses by focusing the analysis on only those parts of the program that are relevant to the properties that the analysis is designed to verify, and 2) to enable the application of specialized analyses that verify properties of specific classes of data structures to programs that simultaneously manipulate several different kinds of data structures. In our approach, each module encapsulates a data structure and uses membership in abstract sets to characterize how objects participate in its data structure. Each analysis verifies that the implementation of the module 1) preserves important internal data structure representation invariants and 2) conforms to a specification that uses formulas in a set algebra to characterize the effects of operations on the data structure. The analyses use the common set abstraction to 1) characterize how objects participate in multiple data structures and 2) enable the inter-analysis communication required to verify properties that depend on multiple modules analyzed by different analyses. We characterize the key soundness property that an analysis plugin must satisfy to successfully participate in our system and present several analysis plugins that satisfy this property: a flag plugin that analyzes modules in which abstract set membership is determined by a flag field in each object, and a graph types plugin that analyzes modules in which abstract set membership is determined by reachability properties of objects stored in tree-like data structures.
2003年12月18日 00:00:00 GMThttps://hdl.handle.net/1721.1/304412003年12月18日T00:00:00ZRosebud: A Scalable Byzantine-Fault-Tolerant Storage Architecture
https://hdl.handle.net/1721.1/30440
Rosebud: A Scalable Byzantine-Fault-Tolerant Storage Architecture
Rodrigues, Rodrigo; Liskov, Barbara
This paper presents Rosebud, a new Byzantine-fault-tolerant storage architecture designed to be highly scalable and deployable in the wide-area. To support massive amounts of data, we need to partition the data among the nodes. To support long-lived operation, we need to allow the set of nodes in the system to change. To our knowledge, we are the first to present a complete design and a running implementation of Byzantine-fault-tolerant storage algorithms for a large scale, dynamic membership. We deployed Rosebud in a wide area testbed and ran experiments to evaluate its performance, and our experiments show that it performs well. We show that our storage algorithms perform equivalently to highly optimized replication algorithms in the wide-area. We also show that performance degradation is minor when the system reconfigures.
2003年12月17日 00:00:00 GMThttps://hdl.handle.net/1721.1/304402003年12月17日T00:00:00ZRamboNodes for the Metropolitan Ad Hoc Network
https://hdl.handle.net/1721.1/30439
RamboNodes for the Metropolitan Ad Hoc Network
Beal, Jacob; Gilbert, Seth
We present an algorithm to store data robustly in a large, geographically distributed network by means of localized regions of data storage that move in response to changing conditions. For example, data might migrate away from failures or toward regions of high demand. The PersistentNode algorithm provides this service robustly, but with limited safety guarantees. We use the RAMBO framework to transform PersistentNode into RamboNode, an algorithm that guarantees atomic consistency in exchange for increased cost and decreased liveness. In addition, a half-life analysis of RamboNode shows that it is robust against continuous low-rate failures. Finally, we provide experimental simulations for the algorithm on 2000 nodes, demonstrating how it services requests and examining how it responds to failures.
2003年12月17日 00:00:00 GMThttps://hdl.handle.net/1721.1/304392003年12月17日T00:00:00ZFast Contour Matching Using Approximate Earth Mover's Distance
https://hdl.handle.net/1721.1/30438
Fast Contour Matching Using Approximate Earth Mover's Distance
Grauman, Kristen; Darrell, Trevor
Weighted graph matching is a good way to align a pair of shapes represented by a set of descriptive local features; the set of correspondences produced by the minimum cost of matching features from one shape to the features of the other often reveals how similar the two shapes are. However, due to the complexity of computing the exact minimum cost matching, previous algorithms could only run efficiently when using a limited number of features per shape, and could not scale to perform retrievals from large databases. We present a contour matching algorithm that quickly computes the minimum weight matching between sets of descriptive local features using a recently introduced low-distortion embedding of the Earth Mover's Distance (EMD) into a normed space. Given a novel embedded contour, the nearest neighbors in a database of embedded contours are retrieved in sublinear time via approximate nearest neighbors search. We demonstrate our shape matching method on databases of 10,000 images of human figures and 60,000 images of handwritten digits.
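A minimal sketch in the spirit of the grid-based L1 embedding of EMD alluded to above; the number of levels, the absence of random grid shifts, and the weighting are simplifying assumptions of ours.

    import numpy as np

    def emd_embed(points, levels=4, extent=1.0):
        """Embed a 2D point set in [0, extent]^2 so that L1 distance between
        embeddings approximates EMD: at each resolution, histogram the points
        into grid cells and weight the counts by the cell side length."""
        vec = []
        for lvl in range(levels):
            cells = 2 ** lvl
            hist, _, _ = np.histogram2d(points[:, 0], points[:, 1],
                                        bins=cells, range=[[0, extent]] * 2)
            vec.append((extent / cells) * hist.ravel())
        return np.concatenate(vec)

    a, b = np.random.rand(50, 2), np.random.rand(50, 2)
    approx_emd = np.abs(emd_embed(a) - emd_embed(b)).sum()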
2003年12月05日 00:00:00 GMThttps://hdl.handle.net/1721.1/304382003年12月05日T00:00:00ZMobilized ad-hoc networks: A reinforcement learning approach
https://hdl.handle.net/1721.1/30437
Mobilized ad-hoc networks: A reinforcement learning approach
Chang, Yu-Han; Ho, Tracey; Kaelbling, Leslie Pack
Research in mobile ad-hoc networks has focused on situations in which nodes have no control over their movements. We investigate an important but overlooked domain in which nodes do have control over their movements. Reinforcement learning methods can be used to control both packet routing decisions and node mobility, dramatically improving the connectivity of the network. We first motivate the problem by presenting theoretical bounds for the connectivity improvement of partially mobile networks and then present superior empirical results under a variety of different scenarios in which the mobile nodes in our ad-hoc network are embedded with adaptive routing policies and learned movement policies.
2003年12月04日 00:00:00 GMThttps://hdl.handle.net/1721.1/304372003年12月04日T00:00:00ZComponent based recognition of objects in an office environment
https://hdl.handle.net/1721.1/30436
Component based recognition of objects in an office environment
Morgenstern, Christian; Heisele, Bernd
We present a component-based approach for recognizing objects under large pose changes. From a set of training images of a given object we extract a large number of components which are clustered based on the similarity of their image features and their locations within the object image. The cluster centers build an initial set of component templates from which we select a subset for the final recognizer. In experiments we evaluate different sizes and types of components and three standard techniques for component selection. The component classifiers are finally compared to global classifiers on a database of four objects.
2003年11月28日 00:00:00 GMThttps://hdl.handle.net/1721.1/304362003年11月28日T00:00:00ZFinding Longest Increasing and Common Subsequences in Streaming Data
https://hdl.handle.net/1721.1/30435
Finding Longest Increasing and Common Subsequences in Streaming Data
Liben-Nowell, David; Vee, Erik; Zhu, An
In this paper, we present algorithms and lower bounds for the Longest Increasing Subsequence (LIS) and Longest Common Subsequence (LCS) problems in the data streaming model.
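For context, the classic offline LIS computation that streaming algorithms are measured against is patience sorting, which uses memory proportional to the answer; the paper studies how much of that space is unavoidable in one pass. A minimal Python sketch of the classic method (not the paper's algorithm):

    import bisect

    def lis_length(stream):
        """Patience sorting: tails[k] is the smallest possible tail of an
        increasing subsequence of length k + 1 seen so far."""
        tails = []
        for x in stream:
            k = bisect.bisect_left(tails, x)   # strictly increasing variant
            if k == len(tails):
                tails.append(x)
            else:
                tails[k] = x
        return len(tails)

    print(lis_length([3, 1, 4, 1, 5, 9, 2, 6]))  # -> 4 (e.g. 1, 4, 5, 9)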
2003年11月26日 00:00:00 GMThttps://hdl.handle.net/1721.1/304352003年11月26日T00:00:00ZThe Satisfiability Threshold of Random 3-SAT Is at Least 3.52
https://hdl.handle.net/1721.1/30434
The Satisfiability Threshold of Random 3-SAT Is at Least 3.52
Hajiaghayi, MohammadTaghi; Sorkin, Gregory B.
We prove that a random 3-SAT instance with clause-to-variable density less than 3.52 is satisfiable with high probability. The proof comes through an algorithm which selects (and sets) a variable depending on its degree and that of its complement.
2003年11月20日 00:00:00 GMThttps://hdl.handle.net/1721.1/304342003年11月20日T00:00:00ZEfficient Specification-Assisted Error Localization and Correction
https://hdl.handle.net/1721.1/30433
Efficient Specification-Assisted Error Localization and Correction
Demsky, Brian; Cadar, Cristian; Roy, Daniel; Rinard, Martin
We present a new error localization tool, Archie, that accepts a specification of key data structure consistency constraints, then generates an algorithm that checks if the data structures satisfy the constraints. We also present a set of specification analyses and optimizations that (for our benchmark software system) improve the performance of the generated checking algorithm by over a factor of 3,900 as compared with the initial interpreted implementation, enabling Archie to efficiently support interactive debugging. We evaluate Archie's effectiveness by observing the actions of two developer populations (one using Archie, the other using standard error localization techniques) as they attempted to localize and correct three errors in a benchmark software system. With Archie, the developers were able to localize each error in less than 10 minutes and correct each error in (usually much) less than 20 minutes. Without Archie, the developers were, with one exception, unable to locate each error after more than an hour of effort. These results illustrate Archie's potential to substantially improve current error localization and correction techniques.
2003年11月13日 00:00:00 GMThttps://hdl.handle.net/1721.1/304332003年11月13日T00:00:00ZScalable Internet Routing on Topology-Independent Node Identities
https://hdl.handle.net/1721.1/30432
Scalable Internet Routing on Topology-Independent Node Identities
Ford, Bryan
Unmanaged Internet Protocol (UIP) is a fully self-organizing network-layer protocol that implements scalable identity-based routing. In contrast with address-based routing protocols, which depend for scalability on centralized hierarchical address management, UIP nodes use a flat namespace of cryptographic node identifiers. Node identities can be created locally on demand and remain stable across network changes. Unlike location-independent name services, the UIP routing protocol can stitch together many conventional address-based networks with disjoint or discontinuous address domains, providing connectivity between any pair of participating nodes even when no underlying network provides direct connectivity. The UIP routing protocol works on networks with arbitrary topologies and global traffic patterns, and requires only O(log N) storage per node for routing state, enabling even small, ubiquitous edge devices to act as ad-hoc self-configuring routers. The protocol rapidly recovers from network partitions, bringing every node up-to-date in a multicast-based chain reaction of O(log N) depth. Simulation results indicate that UIP finds routes that are on average within 2x the length of the best possible route.
2003年10月31日 00:00:00 GMThttps://hdl.handle.net/1721.1/304322003年10月31日T00:00:00ZEvolving Robocode Tank Fighters
https://hdl.handle.net/1721.1/30431
Evolving Robocode Tank Fighters
Eisenstein, Jacob
In this paper, I describe the application of genetic programming to evolve a controller for a robotic tank in a simulated environment. The purpose is to explore how genetic techniques can best be applied to produce controllers based on subsumption and behavior-oriented languages such as REX. As part of my implementation, I developed TableRex, a modification of REX that can be expressed on a fixed-length genome. Using a fixed subsumption architecture of TableRex modules, I evolved robots that beat some of the most competitive hand-coded adversaries.
2003年10月28日 00:00:00 GMThttps://hdl.handle.net/1721.1/304312003年10月28日T00:00:00ZOn Role Logic
https://hdl.handle.net/1721.1/30430
On Role Logic
Kuncak, Viktor; Rinard, Martin
We present role logic, a notation for describing properties of relational structures in shape analysis, databases, and knowledge bases. We construct role logic using the ideas of de Bruijn's notation for lambda calculus, an encoding of first-order logic in lambda calculus, and a simple rule for implicit arguments of unary and binary predicates. The unrestricted version of role logic has the expressive power of first-order logic with transitive closure. Using a syntactic restriction on role logic formulas, we identify a natural fragment RL^2 of role logic. We show that the RL^2 fragment has the same expressive power as two-variable logic with counting C^2 and is therefore decidable. We present a translation of an imperative language into the decidable fragment RL^2, which allows compositional verification of programs that manipulate relational structures. In addition, we show how RL^2 encodes boolean shape analysis constraints and an expressive description logic.
2003年10月24日 00:00:00 GMThttps://hdl.handle.net/1721.1/304302003年10月24日T00:00:00ZA Stream Algorithm for the SVD
https://hdl.handle.net/1721.1/30429
A Stream Algorithm for the SVD
Strumpen, Volker; Hoffmann, Henry; Agarwal, Anant
We present a stream algorithm for the Singular-Value Decomposition (SVD) of an M × N matrix A. Our algorithm trades speed of numerical convergence for parallelism, and derives from a one-sided, cyclic-by-rows Hestenes SVD. Experimental results show that we can create O(M) parallelism, at the expense of increasing the computational work by less than a factor of about 2. Our algorithm qualifies as a stream algorithm in that it requires no more than a small, bounded amount of local storage per processor, and its compute efficiency approaches an optimal 100% asymptotically for large numbers of processors and appropriate problem sizes.
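The serial kernel underlying the paper's parallel scheme is the one-sided Hestenes (Jacobi) SVD: sweep over column pairs, rotating each pair until all columns are mutually orthogonal, at which point the column norms are the singular values. A minimal NumPy sketch of that serial iteration (the stream-parallel mapping onto processors is the paper's contribution and is not reproduced here):

    import numpy as np

    def hestenes_svd(A, sweeps=30, tol=1e-12):
        """One-sided (Hestenes) Jacobi SVD: rotate column pairs of A until
        all pairs are orthogonal; returns U, singular values, V^T."""
        A = A.astype(float).copy()
        n = A.shape[1]
        V = np.eye(n)
        for _ in range(sweeps):
            converged = True
            for p in range(n - 1):
                for q in range(p + 1, n):
                    apq = A[:, p] @ A[:, q]
                    app = A[:, p] @ A[:, p]
                    aqq = A[:, q] @ A[:, q]
                    if abs(apq) <= tol * np.sqrt(app * aqq):
                        continue            # this pair is already orthogonal
                    converged = False
                    tau = (aqq - app) / (2.0 * apq)
                    sgn = 1.0 if tau >= 0 else -1.0
                    t = sgn / (abs(tau) + np.sqrt(1.0 + tau * tau))
                    c = 1.0 / np.sqrt(1.0 + t * t)
                    s = c * t
                    for M in (A, V):        # same rotation on A and on V
                        col_p = M[:, p].copy()
                        M[:, p] = c * col_p - s * M[:, q]
                        M[:, q] = s * col_p + c * M[:, q]
            if converged:
                break
        sing = np.linalg.norm(A, axis=0)    # unsorted singular values
        return A / sing, sing, V.T

    A = np.random.rand(6, 4)
    U, s, Vt = hestenes_svd(A)
    print(np.allclose((U * s) @ Vt, A))     # True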
2003年10月22日 00:00:00 GMThttps://hdl.handle.net/1721.1/304292003年10月22日T00:00:00ZUpdatable Zero-Knowledge Sets
https://hdl.handle.net/1721.1/30428
Updatable Zero-Knowledge Sets
Liskov, Moses; Micali, Silvio
We build on the work of Micali, Rabin, and Kilian [4] to introduce zero-knowledge sets and databases that may be updated in a desirable way. In particular, in order to make an update, the owner of the set must publish a commitment to the update and update the commitment to the set. The update should take time independent of the size of the set. In addition, the update should not leak which key was added (or removed), or what data is associated with that key. Furthermore, our update will be transparent in that those already possessing a proof of a particular key being present or absent should be able to update their proofs to obtain a valid proof relative to the updated set, except if their proof is relative to the element that was changed.
2003年10月14日 00:00:00 GMThttps://hdl.handle.net/1721.1/304282003年10月14日T00:00:00ZA Correctness Proof for a Byzantine-Fault-Tolerant Read/Write Atomic Memory with Dynamic Replica Membership
https://hdl.handle.net/1721.1/30425
A Correctness Proof for a Byzantine-Fault-Tolerant Read/Write Atomic Memory with Dynamic Replica Membership
Rodrigues, Rodrigo; Liskov, Barbara
We prove correctness of a Byzantine-fault-tolerant replication algorithm for a read/write atomic memory that supports a dynamic replica set.
2003年9月25日 00:00:00 GMThttps://hdl.handle.net/1721.1/304252003年09月25日T00:00:00ZInvestigating shape representation in area V4 with HMAX: Orientation and Grating selectivities
https://hdl.handle.net/1721.1/30424
Investigating shape representation in area V4 with HMAX: Orientation and Grating selectivities
Kouh, Minjoon; Riesenhuber, Maximilian
The question of how shape is represented is of central interest to understanding visual processing in cortex. While tuning properties of the cells in the early part of the ventral visual stream, thought to be responsible for object recognition in the primate, are comparatively well understood, several different theories have been proposed regarding tuning in higher visual areas, such as V4. We used the model of object recognition in cortex presented by Riesenhuber and Poggio (1999), where more complex shape tuning in higher layers is the result of combining afferent inputs tuned to simpler features, and compared the tuning properties of model units in intermediate layers to those of V4 neurons from the literature. In particular, we investigated the issue of shape representation in visual areas V1 and V4 using oriented bars and various types of gratings (polar, hyperbolic, and Cartesian), as used in several physiology experiments. Our computational model was able to reproduce several physiological findings, such as the broadening distribution of the orientation bandwidths and the emergence of a bias toward non-Cartesian stimuli. Interestingly, the simulation results suggest that some V4 neurons receive input from afferents with spatially separated receptive fields, leading to experimentally testable predictions. However, the simulations also show that the stimulus set of Cartesian and non-Cartesian gratings is not sufficiently complex to probe shape tuning in higher areas, necessitating the use of more complex stimulus sets.
2003年9月08日 00:00:00 GMThttps://hdl.handle.net/1721.1/304242003年09月08日T00:00:00ZExploiting Vector Parallelism in Software Pipelined Loops
https://hdl.handle.net/1721.1/30423
Exploiting Vector Parallelism in Software Pipelined Loops
Larsen, Sam; Rabbah, Rodric; Amarasinghe, Saman
An emerging trend in processor design is the incorporation of short vector instructions into the ISA. In fact, vector extensions have appeared in most general-purpose microprocessors. To utilize these instructions, traditional vectorization technology can be used to identify and exploit data parallelism. In contrast, efficient use of a processor's scalar resources is typically achieved through ILP techniques such as software pipelining. In order to attain the best performance, it is necessary to utilize both sets of resources. This paper presents a novel approach for exploiting vector parallelism in a software pipelined loop. At its core is a method for judiciously partitioning operations between vector and scalar resources. The proposed algorithm (i) lowers the burden on the scalar resources by offloading computation to the vector functional units, and (ii) partially (or fully) inhibits the optimizations when full vectorization will decrease performance. This results in better resource usage and allows for software pipelining with shorter initiation intervals. Although our techniques complement statically scheduled machines most naturally, we believe they are applicable to any architecture that tightly integrates support for ILP and data parallelism. An important aspect of the proposed methodology is its ability to manage explicit communication of operands between vector and scalar instructions. Our methodology also allows for a natural handling of misaligned vector memory operations. For architectures that provide hardware support for misaligned references, software pipelining effectively hides the latency of these potentially expensive instructions. When explicit alignment is required in software, our algorithm accounts for these extra costs and vectorizes only when it is profitable. Finally, our heuristic can take advantage of alignment information where it is available. We evaluate our methodology using several DSP and SPEC FP benchmarks. Compared to software pipelining, our approach is able to achieve an average speedup of 1.30x and 1.18x for the two benchmark sets, respectively.
2005年6月03日 00:00:00 GMThttps://hdl.handle.net/1721.1/304232005年06月03日T00:00:00ZDynamic Input/Output Automata: A Formal Model for Dynamic Systems
https://hdl.handle.net/1721.1/30422
Dynamic Input/Output Automata: A Formal Model for Dynamic Systems
Attie, Paul C.; Lynch, Nancy A.
We present a mathematical state-machine model, the Dynamic I/O Automaton (DIOA) model, for defining and analyzing dynamic systems of interacting components. The systems we consider are dynamic in two senses: (1) components can be created and destroyed as computation proceeds, and (2) the events in which the components may participate may change. The new model admits a notion of external system behavior, based on sets of traces. It also features a parallel composition operator for dynamic systems, which respects external behavior, and a notion of simulation from one dynamic system to another, which can be used to prove that one system implements the other.
2003年7月26日 00:00:00 GMThttps://hdl.handle.net/1721.1/304222003年07月26日T00:00:00ZOn Our Experience with Modular Pluggable Analyses
https://hdl.handle.net/1721.1/30421
On Our Experience with Modular Pluggable Analyses
Lam, Patrick; Kuncak, Viktor; Rinard, Martin
We present a technique that enables the focused application of multiple analyses to different modules in the same program. In our approach, each module encapsulates one or more data structures and uses membership in abstract sets to characterize how objects participate in data structures. Each analysis verifies that the implementation of the module 1) preserves important internal data structure consistency properties and 2) correctly implements an interface that uses formulas in a set algebra to characterize the effects of operations on the encapsulated data structures. Collectively, the analyses use the set algebra to 1) characterize how objects participate in multiple data structures and to 2) enable the inter-analysis communication required to verify properties that depend on multiple modules analyzed by different analyses. We have implemented our system and deployed three pluggable analyses into it: a flag analysis for modules in which abstract set membership is determined by a flag field in each object, a plugin for modules that encapsulate linked data structures such as lists and trees, and an array plugin in which abstract set membership is determined by membership in an array. Our experimental results indicate that our approach makes it possible to effectively combine multiple analyses to verify properties that involve objects shared by multiple modules, with each analysis analyzing only those modules for which it is appropriate.
2004年10月04日 00:00:00 GMThttps://hdl.handle.net/1721.1/304212004年10月04日T00:00:00ZPyramid Match Kernels: Discriminative Classification with Sets of Image Features
https://hdl.handle.net/1721.1/30420
Pyramid Match Kernels: Discriminative Classification with Sets of Image Features
Grauman, Kristen; Darrell, Trevor
Discriminative learning is challenging when examples are sets of local image features, and the sets vary in cardinality and lack any sort of meaningful ordering. Kernel-based classification methods can learn complex decision boundaries, but a kernel similarity measure for unordered set inputs must somehow solve for correspondences -- generally a computationally expensive task that becomes impractical for large set sizes. We present a new fast kernel function which maps unordered feature sets to multi-resolution histograms and computes a weighted histogram intersection in this space. This "pyramid match" computation is linear in the number of features, and it implicitly finds correspondences based on the finest resolution histogram cell where a matched pair first appears. Since the kernel does not penalize the presence of extra features, it is robust to clutter. We show the kernel function is positive-definite, making it valid for use in learning algorithms whose optimal solutions are guaranteed only for Mercer kernels. We demonstrate our algorithm on object recognition tasks and show it to be dramatically faster than current approaches.
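The kernel itself is compact to state: histogram each feature set at several resolutions and credit each newly matched pair, i.e. each increase in histogram intersection from one level to the next, with a weight inversely proportional to the cell size at which it appeared. A minimal 1-D Python sketch under assumed feature ranges and level counts (real image features are higher-dimensional):

    import numpy as np

    def pyramid_match(X, Y, num_levels=5, extent=1.0):
        """Sum, over levels, the *new* histogram intersections, weighted by
        1 / 2**level so coarser (easier) matches count for less."""
        kernel, prev = 0.0, 0.0
        for level in range(num_levels):
            bins = 2 ** (num_levels - level - 1)   # finest grid first
            hx, _ = np.histogram(X, bins=bins, range=(0.0, extent))
            hy, _ = np.histogram(Y, bins=bins, range=(0.0, extent))
            inter = float(np.minimum(hx, hy).sum())
            kernel += (inter - prev) / 2 ** level
            prev = inter
        return kernel

    X, Y = np.random.rand(30), np.random.rand(40)
    print(pyramid_match(X, Y))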
2005年3月17日 00:00:00 GMThttps://hdl.handle.net/1721.1/304202005年03月17日T00:00:00ZSystematic Conformational Search with Constraint Satisfaction
https://hdl.handle.net/1721.1/30419
Systematic Conformational Search with Constraint Satisfaction
Tucker-Kellogg, Lisa
Throughout biological, chemical, and pharmaceutical research, conformational searches are used to explore the possible three-dimensional configurations of molecules. This thesis describes a new systematic method for conformational search, including an application of the method to determining the structure of a peptide via solid-state NMR spectroscopy. A separate portion of the thesis is about protein-DNA binding, with a three-dimensional macromolecular structure determined by x-ray crystallography. The search method in this thesis enumerates all conformations of a molecule (at a given level of torsion angle resolution) that satisfy a set of local geometric constraints, such as constraints derived from NMR experiments. Systematic searches, historically used for small molecules, generally now use some form of divide-and-conquer for application to larger molecules. Our method can achieve a significant improvement in runtime by making some major and counter-intuitive modifications to traditional divide-and-conquer: (1) OmniMerge divides a polymer into many alternative pairs of subchains and searches all the pairs, instead of simply cutting in half and searching two subchains. Although the extra searches may appear wasteful, the bottleneck stage of the overall search, which is to re-connect the conformations of the largest subchains, can be greatly accelerated by the availability of alternative pairs of subchains. (2) Propagation of disqualified conformations across overlapping subchains can disqualify infeasible conformations very rapidly, which further offsets the cost of searching the extra subchains of OmniMerge. (3) The search may be run in two stages: once at low resolution, using a side-effect of OmniMerge to determine an optimal partitioning of the molecule into efficient subchains; then again at high resolution, while making use of the precomputed subchains. (4) An A* function prioritizes each subchain based on estimated future search costs. Subchains with sufficiently low priority can be omitted from the search, which improves efficiency. A common theme of these four ideas is to make good choices about how to break the large search problem into lower-dimensional subproblems. In addition, the search method uses heuristic local searches within the overall systematic framework, to maintain the systematic guarantee while providing the empirical efficiency of stochastic search. These novel algorithms were implemented and the effectiveness of each innovation is demonstrated on a highly constrained peptide with 40 degrees of freedom.
2004年10月01日 00:00:00 GMThttps://hdl.handle.net/1721.1/304192004年10月01日T00:00:00ZAutomatic Software Upgrades for Distributed Systems (PhD thesis)
https://hdl.handle.net/1721.1/30418
Automatic Software Upgrades for Distributed Systems (PhD thesis)
Ajmani, Sameer
Upgrading the software of long-lived, highly-available distributed systems is difficult. It is not possible to upgrade all the nodes in a system at once, since some nodes may be unavailable and halting the system for an upgrade is unacceptable. Instead, upgrades may happen gradually, and there may be long periods of time when different nodes are running different software versions and need to communicate using incompatible protocols. We present a methodology and infrastructure that address these challenges and make it possible to upgrade distributed systems automatically while limiting service disruption. Our methodology defines how to enable nodes to interoperate across versions, how to preserve the state of a system across upgrades, and how to schedule an upgrade so as to limit service disruption. The approach is modular: defining an upgrade requires understanding only the new software and the version it replaces. The upgrade infrastructure is a generic platform for distributing and installing software while enabling nodes to interoperate across versions. The infrastructure requires no access to the system source code and is transparent: node software is unaware that different versions even exist. We have implemented a prototype of the infrastructure called Upstart that intercepts socket communication using a dynamically-linked C++ library. Experiments show that Upstart has low overhead and works well for both local-area and Internet systems.
2005年10月06日 00:00:00 GMThttps://hdl.handle.net/1721.1/304182005年10月06日T00:00:00ZSelectivity of Local Field Potentials in Macaque Inferior Temporal Cortex
https://hdl.handle.net/1721.1/30417
Selectivity of Local Field Potentials in Macaque Inferior Temporal Cortex
Kreiman, Gabriel; Hung, Chou; Poggio, Tomaso; DiCarlo, James
While single neurons in inferior temporal (IT) cortex show differential responses to distinct complex stimuli, little is known about the responses of populations of neurons in IT. We recorded single electrode data, including multi-unit activity (MUA) and local field potentials (LFP), from 618 sites in the inferior temporal cortex of macaque monkeys while the animals passively viewed 78 different pictures of complex stimuli. The LFPs were obtained by low-pass filtering the extracellular electrophysiological signal with a corner frequency of 300 Hz. As reported previously, we observed that spike counts from MUA showed selectivity for some of the pictures. Strikingly, the LFP data, which is thought to constitute an average over large numbers of neurons, also showed significantly selective responses. The LFP responses were less selective than the MUA responses, both in terms of the proportion of selective sites and in the selectivity of each site. We observed that there was only little overlap between the selectivity of MUA and LFP recordings from the same electrode. To assess the spatial organization of selective responses, we compared the selectivity of nearby sites recorded along the same penetration and sites recorded from different penetrations. We observed that MUA selectivity was correlated on spatial scales up to 800 μm, while the LFP selectivity was correlated over a larger spatial extent, with significant correlations between sites separated by several mm. Our data support the idea that there is some topographical arrangement to the organization of selectivity in inferior temporal cortex and that this organization may be relevant for the representation of object identity in IT.
2004年9月21日 00:00:00 GMThttps://hdl.handle.net/1721.1/304172004年09月21日T00:00:00ZThe Interval Programming Model for Multi-objective Decision Making
https://hdl.handle.net/1721.1/30416
The Interval Programming Model for Multi-objective Decision Making
Benjamin, Michael R.
The interval programming model (IvP) is a mathematical programming model for representing and solving multi-objective optimization problems. The central characteristic of the model is the use of piecewise linearly defined objective functions and a solution method that searches through the combination space of pieces rather than through the actual decision space. The piecewise functions typically represent an approximation of some underlying function, but this concession is balanced on the positive side by relative freedom from function form assumptions as well as the assurance of global optimality. In this paper the model and solution algorithms are described, and the applicability of IvP to certain applications is discussed.
2004年9月27日 00:00:00 GMThttps://hdl.handle.net/1721.1/304162004年09月27日T00:00:00ZThe Quorum Deployment Problem
https://hdl.handle.net/1721.1/30415
The Quorum Deployment Problem
Gilbert, Seth; Malewicz, Grzegorz
Quorum systems are commonly used to maintain the consistency of replicated data in a distributed system. Much research has been devoted to developing quorum systems with good theoretical properties, such as fault tolerance and high availability. However, even given a theoretically good quorum system, it is not obvious how to efficiently deploy such a system in a real network. This paper introduces a new combinatorial optimization problem, the Quorum Deployment Problem, and studies its complexity. We demonstrate that it is NP-hard to approximate the Quorum Deployment Problem within any factor of n^ε, where n is the number of nodes in the distributed network and ε > 0. The problem is NP-hard in even the simplest possible distributed network: a one-dimensional line with metric cost. We begin to study algorithms for variants of the problem. Some variants can be solved optimally in polynomial time and some NP-hard variants can be approximated to within a constant factor.
2004年10月29日 00:00:00 GMThttps://hdl.handle.net/1721.1/304152004年10月29日T00:00:00ZEclat: Automatic Generation and Classification of Test Inputs
https://hdl.handle.net/1721.1/30414
Eclat: Automatic Generation and Classification of Test Inputs
Pacheco, Carlos; Ernst, Michael D.
This paper describes a technique that helps a test engineer select, from a large set of randomly generated test inputs, a small subset likely to reveal faults in the software under test. The technique takes a program or software component, plus a set of normal executions -- say, from an existing test suite, or from observations of the software running properly. The technique works by extracting an operational model of the software's operation, and comparing each input's operational pattern of execution against the model. Test inputs whose operational pattern is suggestive of a fault are further reduced by selecting only one input per such pattern. The result is a small portion of the original inputs, deemed most likely to reveal faults. Thus, our technique can also be seen as an error-detection technique. We have implemented these ideas in the Eclat tool, designed for unit testing of Java classes. Eclat generates a large number of inputs and uses our technique to select only a few of them as fault-revealing. The inputs that it selects are an order of magnitude more likely to reveal faults than non-selected inputs.
2004年10月14日 00:00:00 GMThttps://hdl.handle.net/1721.1/304142004年10月14日T00:00:00ZThe Architecture of MAITA: A Tool for Monitoring, Analysis, and Interpretation
https://hdl.handle.net/1721.1/30413
The Architecture of MAITA: A Tool for Monitoring, Analysis, and Interpretation
Doyle, Jon; Kohane, Isaac; Long, William; Szolovits, Peter
This report describes the aims, functions, and organization of the MAITA system for knowledge-based construction, adaptation, and control of networks of monitoring processes.
2004年5月18日 00:00:00 GMThttps://hdl.handle.net/1721.1/304132004年05月18日T00:00:00ZImplementing Asynchronous Distributed Systems Using the IOA Toolkit
https://hdl.handle.net/1721.1/30412
Implementing Asynchronous Distributed Systems Using the IOA Toolkit
Georgiou, Chryssis; Mavrommatis, Panayiotis P.; Tauber, Joshua A.
This document is a report about the capabilities and performance of the IOA Toolkit, and in particular the tools that provide support for implementing and running distributed systems (checker, composer, code generator). The Toolkit compiles distributed systems specified in IOA into Java classes, which run on a network of workstations and communicate using the Message Passing Interface (MPI). In order to test the toolkit, several distributed algorithms were implemented, ranging from simple algorithms such as LCR leader election in a ring network to more complex algorithms such as the GHS algorithm for computing the minimum spanning tree in an arbitrary graph. All of our experiments completed successfully, and several runtime measurements were made.
2004年10月06日 00:00:00 GMThttps://hdl.handle.net/1721.1/304122004年10月06日T00:00:00ZPredictive identification of alternative events conserved in human and mouse
https://hdl.handle.net/1721.1/30411
Predictive identification of alternative events conserved in human and mouse
Yeo, Gene; Van Nostrand, Eric; Holste, Dirk; Poggio, Tomaso; Burge, Christopher
Alternative pre-messenger RNA splicing affects a majority of human genes and plays important roles in development and disease. Alternative splicing (AS) events conserved since the divergence of human and mouse are likely of primary biological importance, but relatively few such events are known. Here we describe sequence features that distinguish exons subject to evolutionarily conserved AS, which we call 'alternative-conserved exons' (ACEs), from other orthologous human/mouse exons, and integrate these features into an exon classification algorithm, ACEScan. Genome-wide analysis of annotated orthologous human-mouse exon pairs identified ~2,000 predicted ACEs. Alternative splicing was verified in both human and mouse tissues using an RT-PCR-sequencing protocol for 21 of 30 (70%) predicted ACEs tested, supporting the validity of a majority of ACEScan predictions. By contrast, AS was observed in mouse tissues for only 2 of 15 (13%) tested exons that had EST or cDNA evidence of AS in human but were not predicted ACEs, and was never observed for eleven negative control exons in human or mouse tissues. Predicted ACEs were much more likely to preserve reading frame, and less likely to disrupt protein domains, than other AS events, and were enriched in genes expressed in the brain and in genes involved in transcriptional regulation, RNA processing, and development. Our results also imply that the vast majority of AS events represented in the human EST databases are not conserved in mouse, and therefore may represent aberrant, disease- or allele-specific, or highly lineage-restricted splicing events.
2004年9月30日 00:00:00 GMThttps://hdl.handle.net/1721.1/304112004年09月30日T00:00:00ZA Reliable Broadcast Scheme for Sensor Networks
https://hdl.handle.net/1721.1/30410
A Reliable Broadcast Scheme for Sensor Networks
Livadas, Carolos; Lynch, Nancy A.
In this short technical report, we present a simple yet effective reliable broadcast protocol for sensor networks. This protocol disseminates packets throughout the sensor network by flooding and recovers from losses resulting from collisions by having hosts retransmit packets whenever they notice that their neighbors have fallen behind. Such retransmissions serve to flood the appropriate packets throughout the regions of the sensor network that did not receive the given packets as a result of prior flooding attempts.
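A toy in-memory sketch of the flood-and-repair idea: packets carry per-source sequence numbers, duplicates stop the flood, and a node holding packets a neighbor lacks retransmits them so re-flooding fills the gap. In the real protocol nodes infer that neighbors have fallen behind from overheard traffic; this sketch cheats by inspecting neighbor state directly, and all names are illustrative.

    class Node:
        def __init__(self, name):
            self.name = name
            self.neighbors = []
            self.received = {}          # source -> {seq: payload}

        def deliver(self, source, seq, payload):
            got = self.received.setdefault(source, {})
            if seq in got:
                return                  # duplicate: stop the flood here
            got[seq] = payload
            for nbr in self.neighbors:  # flood onward
                nbr.deliver(source, seq, payload)

        def repair(self):
            # Retransmit anything a neighbor is missing; the re-flood then
            # spreads the recovered packets through the neighbor's region.
            for nbr in self.neighbors:
                for source, got in self.received.items():
                    behind = got.keys() - nbr.received.get(source, {}).keys()
                    for seq in sorted(behind):
                        nbr.deliver(source, seq, got[seq])

    a, b, c = Node("a"), Node("b"), Node("c")
    a.neighbors, b.neighbors, c.neighbors = [b], [a, c], [b]
    a.deliver("a", 0, "p0")             # normal flood reaches b and c
    a.received["a"][1] = "p1"           # simulate a collision: only a has p1
    a.repair()                          # a notices b fell behind and resends
    print(c.received["a"])              # {0: 'p0', 1: 'p1'}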
2003年8月11日 00:00:00 GMThttps://hdl.handle.net/1721.1/304102003年08月11日T00:00:00ZOn The Boolean Algebra of Shape Analysis Constraints
https://hdl.handle.net/1721.1/30409
On The Boolean Algebra of Shape Analysis Constraints
Kuncak, Viktor; Rinard, Martin
Shape analysis is a promising technique for statically verifying and extracting properties of programs that manipulate complex data structures. We introduce a new characterization of constraints that arise in parametric shape analysis based on manipulation of three-valued structures as dataflow facts. We identify an interesting syntactic class of first-order logic formulas that captures the meaning of three-valued structures under concretization. This class is broader than previously introduced classes, allowing for greater flexibility in the formulation of shape analysis constraints in program annotations and internal analysis representations. Three-valued structures can be viewed as one possible normal form of the formulas in our class. Moreover, we characterize the meaning of three-valued structures under 'tight concretization'. We show that the seemingly minor change from concretization to tight concretization increases the expressive power of three-valued structures in such a way that the resulting constraints are closed under all boolean operations. We call the resulting constraints boolean shape analysis constraints. The main technical contribution of this paper is a natural syntactic characterization of boolean shape analysis constraints as arbitrary boolean combinations of first-order sentences of certain form, and an algorithm for transforming such boolean combinations into the normal form that corresponds directly to three-valued structures. Our result holds in the presence of arbitrary shape analysis instrumentation predicates. The result enables the reduction (without any approximation) of the entailment and the equivalence of shape analysis constraints to the satisfiability of shape analysis constraints. When the satisfiability of the constraints is decidable, our result implies that the entailment and the equivalence of the constraints are also decidable, which enables the use of constraints in a compositional shape analysis with a predictable behavior.
2003年8月22日 00:00:00 GMThttps://hdl.handle.net/1721.1/304092003年08月22日T00:00:00ZPermutation Tests for Classification
https://hdl.handle.net/1721.1/30408
Permutation Tests for Classification
Mukherjee, Sayan; Golland, Polina; Panchenko, Dmitry
We introduce and explore an approach to estimating statistical significance of classification accuracy, which is particularly useful in scientific applications of machine learning where high dimensionality of the data and the small number of training examples render most standard convergence bounds too loose to yield a meaningful guarantee of the generalization ability of the classifier. Instead, we estimate statistical significance of the observed classification accuracy, or the likelihood of observing such accuracy by chance due to spurious correlations of the high-dimensional data patterns with the class labels in the given training set. We adopt permutation testing, a non-parametric technique previously developed in classical statistics for hypothesis testing in the generative setting (i.e., comparing two probability distributions). We demonstrate the method on real examples from neuroimaging studies and DNA microarray analysis and suggest a theoretical analysis of the procedure that relates the asymptotic behavior of the test to the existing convergence bounds.
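The procedure is simple to state: recompute the classifier's (cross-validated) accuracy many times with randomly permuted labels, and report the fraction of permutations that match or beat the accuracy on the true labels. A minimal Python sketch, with a nearest-centroid leave-one-out scorer standing in for whatever classifier a study actually uses; all names and constants here are illustrative:

    import numpy as np

    def permutation_pvalue(score_fn, X, y, n_perm=200, seed=0):
        """P-value of the observed accuracy under the null hypothesis that
        labels are unrelated to the data (labels randomly permuted)."""
        rng = np.random.default_rng(seed)
        observed = score_fn(X, y)
        null = np.array([score_fn(X, rng.permutation(y))
                         for _ in range(n_perm)])
        return (1 + (null >= observed).sum()) / (n_perm + 1)

    def centroid_loo_accuracy(X, y):
        # Leave-one-out accuracy of a nearest-centroid classifier.
        hits = 0
        for i in range(len(y)):
            mask = np.arange(len(y)) != i
            c0 = X[mask & (y == 0)].mean(axis=0)
            c1 = X[mask & (y == 1)].mean(axis=0)
            pred = int(np.linalg.norm(X[i] - c1) < np.linalg.norm(X[i] - c0))
            hits += pred == y[i]
        return hits / len(y)

    # High-dimensional, few-sample regime the abstract describes.
    X = np.vstack([np.random.randn(10, 50) - 0.5,
                   np.random.randn(10, 50) + 0.5])
    y = np.array([0] * 10 + [1] * 10)
    print(permutation_pvalue(centroid_loo_accuracy, X, y))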
2003年8月28日 00:00:00 GMThttps://hdl.handle.net/1721.1/304082003年08月28日T00:00:00ZThe Theory of Timed I/O Automata
https://hdl.handle.net/1721.1/30407
The Theory of Timed I/O Automata
Kaynor, Dilsun K.; Lynch, Nancy; Segala, Roberto; Vaandrager, Frits
This monograph presents the Timed Input/Output Automaton (TIOA) modeling framework, a basic mathematical framework to support description and analysis of timed systems.
2005年3月02日 00:00:00 GMThttps://hdl.handle.net/1721.1/304072005年03月02日T00:00:00ZFluorescence Assay for Polymerase Arrival Rates
https://hdl.handle.net/1721.1/30406
Fluorescence Assay for Polymerase Arrival Rates
Che, Austin
To engineer complex synthetic biological systems will require modular design, assembly, and characterization strategies. The RNA polymerase arrival rate (PAR) is defined to be the rate that RNA polymerases arrive at a specified location on the DNA. Designing and characterizing biological modules in terms of RNA polymerase arrival rates provides for many advantages in the construction and modeling of biological systems. PARMESAN is an in vitro method for measuring polymerase arrival rates using pyrrolo-dC, a fluorescent DNA base that can substitute for cytosine. Pyrrolo-dC shows a detectable fluorescence difference when in single-stranded versus double-stranded DNA. During transcription, RNA polymerase separates the two strands of DNA, leading to a change in the fluorescence of pyrrolo-dC. By incorporating pyrrolo-dC at specific locations in the DNA, fluorescence changes can be taken as a direct measurement of the polymerase arrival rate.
2003年8月31日 00:00:00 GMThttps://hdl.handle.net/1721.1/304062003年08月31日T00:00:00ZMarriage, Honesty, and Stability
https://hdl.handle.net/1721.1/30405
Marriage, Honesty, and Stability
Immorlica, Nicole; Mahdian, Mohammad
Many centralized two-sided markets form a matching between participants by running a stable marriage algorithm. It is a well-known fact that no matching mechanism based on a stable marriage algorithm can guarantee truthfulness as a dominant strategy for participants. However, as we will show in this paper, in a probabilistic setting where the preference lists of one side of the market are composed of only a constant (independent of the size of the market) number of entries, each drawn from an arbitrary distribution, the number of participants that have more than one stable partner is vanishingly small. This proves (and generalizes) a conjecture of Roth and Peranson [23]. As a corollary of this result, we show that, with high probability, the truthful strategy is the best response for a given player when the other players are truthful. We also analyze equilibria of the deferred acceptance stable marriage game. We show that the game with complete information has an equilibrium in which a (1−o(1)) fraction of the strategies are truthful in expectation. In the more realistic setting of a game of incomplete information, we will show that the set of truthful strategies form a (1+o(1))-approximate Bayesian-Nash equilibrium. Our results have implications in many practical settings and were inspired by the work of Roth and Peranson [23] on the National Residency Matching Program.
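For reference, the deferred acceptance algorithm the paper analyzes is Gale and Shapley's: each proposer works down his preference list while each receiver tentatively holds the best offer so far. A minimal Python sketch, assuming equal numbers and complete lists (the paper's probabilistic setting, where one side's lists are short and random, is not modeled here):

    def deferred_acceptance(men_prefs, women_prefs):
        """Gale-Shapley with men proposing; women hold their best offer."""
        rank = {w: {m: r for r, m in enumerate(ps)}
                for w, ps in women_prefs.items()}
        free = list(men_prefs)            # men with proposals left to make
        next_choice = {m: 0 for m in men_prefs}
        engaged = {}                      # woman -> man currently held
        while free:
            m = free.pop()
            w = men_prefs[m][next_choice[m]]
            next_choice[m] += 1
            if w not in engaged:
                engaged[w] = m
            elif rank[w][m] < rank[w][engaged[w]]:
                free.append(engaged[w])   # she trades up; old partner re-enters
                engaged[w] = m
            else:
                free.append(m)            # rejected; he tries his next choice
        return {m: w for w, m in engaged.items()}

    men = {"a": ["x", "y"], "b": ["y", "x"]}
    women = {"x": ["b", "a"], "y": ["a", "b"]}
    print(deferred_acceptance(men, women))   # {'a': 'x', 'b': 'y'}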
2003年7月28日 00:00:00 GMThttps://hdl.handle.net/1721.1/304052003年07月28日T00:00:00ZNear-Optimal Distributed Failure Circumscription
https://hdl.handle.net/1721.1/30404
Near-Optimal Distributed Failure Circumscription
Beal, Jacob
Small failures should only disrupt a small part of a network. One way to do this is by marking the surrounding area as untrustworthy --- circumscribing the failure. This can be done with a distributed algorithm using hierarchical clustering and neighbor relations, and the resulting circumscription is near-optimal for convex failures.
2003年8月11日 00:00:00 GMThttps://hdl.handle.net/1721.1/304042003年08月11日T00:00:00ZThe Theory of Timed I/O Automata
https://hdl.handle.net/1721.1/30403
The Theory of Timed I/O Automata
Kaynar, Dilsun K.; Lynch, Nancy; Segala, Roberto; Vaandrager, Frits
Revised version -- November 23, 2004. This paper presents the Timed Input/Output Automaton (TIOA) modeling framework, a basic mathematical framework to support description and analysis of timed systems.
2003年8月27日 00:00:00 GMThttps://hdl.handle.net/1721.1/304032003年08月27日T00:00:00ZSelecting Refining and Evaluating Properties for Program Analysis
https://hdl.handle.net/1721.1/30402
Selecting, Refining, and Evaluating Properties for Program Analysis
Dodoo, Nii; Lin, Lee; Ernst, Michael D.
This research proposes and evaluates techniques for selecting predicates for conditional program properties -- that is, implications such as p ⇒ q whose consequent must be true whenever the predicate is true. Conditional properties are prevalent in recursive data structures, which behave differently in their base and recursive cases, in programs that contain branches, in programs that fail only on some inputs, and in many other situations. The experimental context of the research is dynamic detection of likely program invariants, but the ideas are applicable to other domains. Trying every possible predicate for conditional properties is computationally infeasible and yields too many undesirable properties. This paper compares four policies for selecting predicates: procedure return analysis, code conditionals, clustering, and random selection. It also shows how to improve predicates via iterated analysis. An experimental evaluation demonstrates that the techniques improve performance on two tasks: statically proving the absence of run-time errors with a theorem-prover, and separating faulty from correct executions of erroneous programs.
2003年7月21日 00:00:00 GMThttps://hdl.handle.net/1721.1/304022003年07月21日T00:00:00ZLearning object segmentation from video data
https://hdl.handle.net/1721.1/30401
Learning object segmentation from video data
Ross, Michael G.; Kaelbling, Leslie Pack
This memo describes the initial results of a project to create a self-supervised algorithm for learning object segmentation from video data. Developmental psychology and computational experience have demonstrated that the motion segmentation of objects is a simpler, more primitive process than the detection of object boundaries by static image cues. Therefore, motion information provides a plausible supervision signal for learning the static boundary detection task and for evaluating performance on a test set. A video camera and previously developed background subtraction algorithms can automatically produce a large database of motion-segmented images for minimal cost. The purpose of this work is to use the information in such a database to learn how to detect the object boundaries in novel images using static information, such as color, texture, and shape. This work was funded in part by the Office of Naval Research contract #N00014-00-1-0298, in part by the Singapore-MIT Alliance agreement of 11/6/98, and in part by a National Science Foundation Graduate Student Fellowship.
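The supervision signal comes essentially for free: a static camera plus background subtraction labels moving pixels. A minimal sketch using a per-pixel median background (the memo's previously developed algorithms are more sophisticated; the threshold here is an assumed constant):

    import numpy as np

    def motion_mask(frames, threshold=25):
        """Median-background subtraction: pixels far from the per-pixel
        median of the clip are marked as moving, giving free labels."""
        frames = np.asarray(frames, dtype=float)   # (T, H, W) grayscale
        background = np.median(frames, axis=0)
        return np.abs(frames - background) > threshold

    clip = np.random.randint(0, 255, size=(10, 4, 4))
    labels = motion_mask(clip)
    print(labels.shape, labels.dtype)              # (10, 4, 4) bool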
2003年9月08日 00:00:00 GMThttps://hdl.handle.net/1721.1/304012003年09月08日T00:00:00ZRepresentation and Detection of Shapes in Images
https://hdl.handle.net/1721.1/30400
Representation and Detection of Shapes in Images
Felzenszwalb, Pedro F.
We present a set of techniques that can be used to represent and detect shapes in images. Our methods revolve around a particular shape representation based on the description of objects using triangulated polygons. This representation is similar to the medial axis transform and has important properties from a computational perspective. The first problem we consider is the detection of non-rigid objects in images using deformable models. We present an efficient algorithm to solve this problem in a wide range of situations, and show examples in both natural and medical images. We also consider the problem of learning an accurate non-rigid shape model for a class of objects from examples. We show how to learn good models while constraining them to the form required by the detection algorithm. Finally, we consider the problem of low-level image segmentation and grouping. We describe a stochastic grammar that generates arbitrary triangulated polygons while capturing Gestalt principles of shape regularity. This grammar is used as a prior model over random shapes in a low level algorithm that detects objects in images.
2003年8月08日 00:00:00 GMThttps://hdl.handle.net/1721.1/304002003年08月08日T00:00:00ZSharing visual features for multiclass and multiview object detection
https://hdl.handle.net/1721.1/30399
Sharing visual features for multiclass and multiview object detection
Torralba, Antonio; Murphy, Kevin P.; Freeman, William T.
We consider the problem of detecting a large number of different classes of objects in cluttered scenes. Traditional approaches require applying a battery of different classifiers to the image, at multiple locations and scales. This can be slow and can require a lot of training data, since each classifier requires the computation of many different image features. In particular, for independently trained detectors, the (run-time) computational complexity and the (training-time) sample complexity scale linearly with the number of classes to be detected. It seems unlikely that such an approach will scale up to allow recognition of hundreds or thousands of objects. We present a multi-class boosting procedure (joint boosting) that reduces the computational and sample complexity by finding common features that can be shared across the classes (and/or views). The detectors for each class are trained jointly, rather than independently. For a given performance level, the total number of features required, and therefore the computational cost, is observed to scale approximately logarithmically with the number of classes. The features selected jointly are closer to edges and generic features typical of many natural structures, rather than specific object parts. Those generic features generalize better and considerably reduce the computational cost of an algorithm for multi-class object detection.
2004年4月14日 00:00:00 GMThttps://hdl.handle.net/1721.1/303992004年04月14日T00:00:00ZDissociated Dipoles: Image representation via non-local comparisons
https://hdl.handle.net/1721.1/30398
Dissociated Dipoles: Image representation via non-local comparisons
Balas, Benjamin J.; Sinha, Pawan
A fundamental question in visual neuroscience is how to represent image structure. The most common representational schemes rely on differential operators that compare adjacent image regions. While well-suited to encoding local relationships, such operators have significant drawbacks. Specifically, each filter's span is confounded with the size of its sub-fields, making it difficult to compare small regions across large distances. We find that such long-distance comparisons are more tolerant to common image transformations than purely local ones, suggesting they may provide a useful vocabulary for image encoding. We introduce the 'Dissociated Dipole', or 'Sticks' operator, for encoding non-local image relationships. This operator de-couples filter span from sub-field size, enabling parametric movement between edge- and region-based representation modes. We report on the perceptual plausibility of the operator, and the computational advantages of non-local encoding. Our results suggest that non-local encoding may be an effective scheme for representing image structure.
2003年8月13日 00:00:00 GMThttps://hdl.handle.net/1721.1/303982003年08月13日T00:00:00ZDirection Estimation of Pedestrian from Images
https://hdl.handle.net/1721.1/30397
Direction Estimation of Pedestrian from Images
Shimizu, Hiroaki; Poggio, Tomaso
The capability of estimating the walking direction of people would be useful in many applications such as those involving autonomous cars and robots. We introduce an approach for estimating the walking direction of people from images, based on learning the correct classification of a still image by using SVMs. We find that the performance of the system can be improved by classifying each image of a walking sequence and combining the outputs of the classifier. Experiments were performed to evaluate our system and estimate the trade-off between the number of images in walking sequences and performance.
2003年8月27日 00:00:00 GMThttps://hdl.handle.net/1721.1/303972003年08月27日T00:00:00ZSecure Program Execution Via Dynamic Information Flow Tracking
https://hdl.handle.net/1721.1/30396
Secure Program Execution Via Dynamic Information Flow Tracking
Suh, G. Edward; Lee, Jaewook; Zhang, David; Devadas, Srinivas
We present a simple architectural mechanism called dynamic information flow tracking that can significantly improve the security of computing systems with negligible performance overhead. Dynamic information flow tracking protects programs against malicious software attacks by identifying spurious information flows from untrusted I/O and restricting the usage of the spurious information. Every security attack to take control of a program needs to transfer the program's control to malevolent code. In our approach, the operating system identifies a set of input channels as spurious, and the processor tracks all information flows from those inputs. A broad range of attacks are effectively defeated by checking the use of the spurious values as instructions and pointers. Our protection is transparent to users or application programmers; the executables can be used without any modification. Also, our scheme only incurs, on average, a memory overhead of 1.4% and a performance overhead of 1.1%.
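A software toy of the hardware mechanism: tag every value with a taint bit, set it on untrusted input, propagate it through computation, and trap when a tainted value is about to be used as a control-transfer target. All names here are illustrative; the real mechanism lives in the processor and operating system, not in application code.

    class Tainted:
        """A value bundled with a taint bit; arithmetic propagates taint."""
        def __init__(self, value, tainted):
            self.value = value
            self.tainted = tainted

        def __add__(self, other):
            return Tainted(self.value + other.value,
                           self.tainted or other.tainted)

    def untrusted_input(value):
        return Tainted(value, tainted=True)   # e.g. bytes from the network

    def jump(target):
        # The hardware check: a tainted value must not become a jump target.
        if target.tainted:
            raise RuntimeError("control transfer to tainted address blocked")
        print(f"jumping to {target.value:#x}")

    base = Tainted(0x400000, tainted=False)
    offset = untrusted_input(0x41414141)      # attacker-controlled
    jump(base)                                # allowed
    try:
        jump(base + offset)                   # spurious flow caught
    except RuntimeError as err:
        print(err)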
2003年7月21日 00:00:00 GMThttps://hdl.handle.net/1721.1/303962003年07月21日T00:00:00ZNew Algorithms for Load Balancing in Peer-to-Peer Systems
https://hdl.handle.net/1721.1/29831
New Algorithms for Load Balancing in Peer-to-Peer Systems
Karger, David; Ruhl, Matthias
Load balancing is a critical issue for the efficient operation of peer-to-peer networks. We give new protocols for several scenarios, whose provable performance guarantees are within a constant factor of optimal.
First, we give an improved version of consistent hashing, a scheme used for item to node assignments in the Chord system. In its original form, it required every network node to operate O(log n) virtual nodes to achieve a balanced load, causing a corresponding increase in space and bandwidth usage. Our protocol eliminates the necessity of virtual nodes while maintaining a balanced load. Improving on related protocols, our scheme allows for the deletion of nodes and admits a simpler analysis, since the assignments do not depend on the history of the network.
We then analyze a simple protocol for load sharing by movements of data from higher loaded to lower loaded nodes. This protocol can be extended to preserve the ordering of data items. As an application, we use the last protocol to give an efficient implementation of a distributed data structure for range searches on ordered data.
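For orientation, here is the baseline the first protocol improves on: classic consistent hashing, where each node appears at several pseudo-random points ("virtual nodes") on a hash ring and an item belongs to the first node point clockwise from its own hash. The paper's contribution is achieving a balanced load without the O(log n) virtual-node blowup; this Python sketch shows only the baseline, with illustrative names throughout.

    import bisect
    import hashlib

    def h(key):
        """Map a string to a point on the hash ring."""
        return int(hashlib.sha1(key.encode()).hexdigest(), 16)

    class ConsistentHashRing:
        def __init__(self, nodes, virtual=8):
            # Each node is hashed to `virtual` ring positions to smooth load.
            self.ring = sorted((h(f"{node}#{v}"), node)
                               for node in nodes for v in range(virtual))

        def node_for(self, item):
            # First node position clockwise from the item's hash (with wrap).
            i = bisect.bisect(self.ring, (h(item), "")) % len(self.ring)
            return self.ring[i][1]

    ring = ConsistentHashRing(["n1", "n2", "n3"])
    print(ring.node_for("some-item"))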
2003年7月16日 00:00:00 GMThttps://hdl.handle.net/1721.1/298312003年07月16日T00:00:00ZCompact Representations for Fast Nonrigid Registration of Medical Images
https://hdl.handle.net/1721.1/29830
Compact Representations for Fast Nonrigid Registration of Medical Images
Timoner, Samson
We develop efficient techniques for the non-rigid registration of medical
images by using representations that adapt to the anatomy found in such
images.
Images of anatomical structures typically have uniform intensity interiors
and smooth boundaries. We create methods to represent such regions
compactly using tetrahedra. Unlike voxel-based representations, tetrahedra
can accurately describe the expected smooth surfaces of medical
objects. Furthermore, the interior of such objects can be represented using
a small number of tetrahedra. Rather than describing a medical object using
tens of thousands of voxels, our representations generally contain only a few
thousand elements.
Tetrahedra facilitate the creation of efficient non-rigid registration
algorithms based on finite element methods (FEM). We create a fast,
FEM-based method to non-rigidly register segmented anatomical structures
from two subjects. Using our compact tetrahedral representations, this
method generally requires less than one minute of processing time on a desktop
PC.
We also create a novel method for the non-rigid registration of gray scale
images. To facilitate a fast method, we create a tetrahedral representation
of a displacement field that automatically adapts to both the anatomy in an
image and to the displacement field. The resulting algorithm has a
computational cost that is dominated by the number of nodes in the mesh
(about 10,000), rather than the number of voxels in an image (nearly
10,000,000). For many non-rigid registration problems, we can find a
transformation from one image to another in five minutes. This speed is
important as it allows use of the algorithm during surgery.
We apply our algorithms to find correlations between the shape of
anatomical structures and the presence of schizophrenia. We show that a
study based on our representations outperforms studies based on other
representations. We also use the results of our non-rigid registration
algorithm as the basis of a segmentation algorithm. That algorithm also
outperforms other methods in our tests, producing smoother segmentations
and more accurately reproducing manual segmentations.
2003年7月04日 00:00:00 GMThttps://hdl.handle.net/1721.1/298302003年07月04日T00:00:00ZOn the Max-Flow Min-Cut Ratio for Directed Multicommodity Flows
https://hdl.handle.net/1721.1/29829
On the Max-Flow Min-Cut Ratio for Directed Multicommodity Flows
Hajiaghayi, MohammadTaghi; Leighton, F. Thomson
We give a pure combinatorial problem whose solution determines the max-flow min-cut ratio for directed multicommodity flows. In addition, this combinatorial problem has applications in improving the approximation factor of the Greedy algorithm for the maximum edge-disjoint path problem. More precisely, our upper bound improves the approximation factor for this problem to O(n^{3/4}). Finally, we demonstrate how even for very simple graphs the aforementioned ratio might be very large.
2003年7月05日 00:00:00 GMThttps://hdl.handle.net/1721.1/298292003年07月05日T00:00:00Z