Showing posts with label cognition. Show all posts

Saturday, June 14, 2008

CS: CS Ground Penetrating Radar, History of Matching Pursuit, Reinforcement Learning Blog, Neuroscience and Dynamic Systems

Andriyan Suksmono at the Institut Teknologi Bandung blogs on Compressed Sensing. The blog's name is Chaotic Pearls and most of it is in Indonesian. His latest entry is on the use of Compressed Sensing in the context of a Ground Penetrating Radar application. This is new. The presentation (in English) is entitled: A Compressive SFCW-GPR System (also here) by Andriyan Suksmono, Endon Bharata, A. Andaya Lestari, A. Yarovoy, and L.P. Ligthart. The abstract of the paper reads:

Data acquisition speed is an inherent problem of the stepped-frequency continuous wave (SFCW) radars, which discouraging further usage and development of this technology. We propose an emerging paradigm called the compressed sensing (CS), to manage this problem. In the CS, a signal can be reconstructed exactly based on only a few samples below the Nyquist rate. Accordingly, the data acquisition speed can be increased significantly. A novel design of SFCW ground penetrating radar (GPR) with a high acquisition speed capability is proposed and evaluated. Simulation by a mono-cycle waveform and actual measurement by a Vector Network Analyzer in a GPR test-range confirm the implementability of the proposed system.

The architecture looks like this:

and some photos of the experiment are also shown below. The rest of the presentation shows some of the reconstruction results using l1-magic.
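For readers who want to see the flavor of such a reconstruction, here is a minimal sketch (not the authors' actual SFCW-GPR processing chain): a time-sparse reflectivity profile is observed at a small random subset of frequencies and recovered by l1 minimization, with plain iterative soft thresholding standing in for the l1-magic solver. All sizes and names are illustrative.

```python
import numpy as np

# Toy stand-in for a compressive SFCW measurement: a time-sparse reflectivity
# profile observed at a few randomly chosen frequencies (sub-Nyquist), then
# recovered by l1 minimization via iterative soft thresholding (ISTA).
rng = np.random.default_rng(0)
n, m, k = 256, 64, 5                          # signal length, measurements, sparsity

x_true = np.zeros(n)
x_true[rng.choice(n, k, replace=False)] = rng.standard_normal(k)   # point reflectors

F = np.fft.fft(np.eye(n), norm="ortho")       # unitary DFT matrix
rows = rng.choice(n, m, replace=False)        # random subset of frequencies
A = F[rows, :]                                # m x n measurement matrix, orthonormal rows
y = A @ x_true                                # "measured" frequency samples

lam, step = 0.01, 1.0                         # step = 1 is safe since ||A|| = 1
x = np.zeros(n, dtype=complex)
for _ in range(500):
    z = x + step * A.conj().T @ (y - A @ x)   # gradient step on the data-fit term
    mag = np.abs(z)
    x = np.where(mag > step * lam, (1 - step * lam / np.maximum(mag, 1e-12)) * z, 0)  # soft threshold

print("relative error:", np.linalg.norm(x.real - x_true) / np.linalg.norm(x_true))
```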



Here is another blogger. Laurent Jacques, a contributor to this blog, has decided to start his own blog entitled: Le Petit Chercheur Illustré, Yet another signal processing (and applied math) blog. His first technical entry is an inspiring historical perspective on the Matching Pursuit technique.

Some of you know of my interest in Robotics and Artificial Intelligence. In particular, learning in low dimensional spaces. Two items appeared on my radar this week:

  • A blog: The Reinforcement Learning Blog

    and a paper entitled:

  • Where neuroscience and dynamic system theory meet autonomous robotics: A contracting basal ganglia model for action selection. by B. Girard, Nicolas Tabareau, Quang Cuong Pham, Alain Berthoz, Jean-Jacques Slotine. The abstract reads:
    Action selection, the problem of choosing what to do next, is central to any autonomous agent architecture. We use here a multi-disciplinary approach at the convergence of neuroscience, dynamical system theory and autonomous robotics, in order to propose an efficient action selection mechanism based on a new model of the basal ganglia. We first describe new developments of contraction theory regarding locally projected dynamical systems. We exploit these results to design a stable computational model of the cortico-baso-thalamo-cortical loops. Based on recent anatomical data, we include usually neglected neural projections, which participate in performing accurate selection. Finally, the efficiency of this model as an autonomous robot action selection mechanism is assessed in a standard survival task. The model exhibits valuable dithering avoidance and energy-saving properties, when compared with a simple if-then-else decision rule.
Saturday, May 10, 2008

    Compressed Sensing Meets Machine Learning: Classification via Sparse Representation and Distributed Pattern Recognition

This spring, Allen Yang gave a mini-course at Berkeley entitled Compressed Sensing Meets Machine Learning. The three lectures are listed here (accompanying code is included).
The third lecture is more focused on a very interesting applied subject. I am absolutely sure that it can have some important bearing on similar problems I have covered in this blog earlier, namely gesture understanding and eye tracking. For certain conditions, detection matters less than continuously quantifying how well intervention methods are improving the condition.

    Thursday, February 07, 2008

    NIPS 2007 Tutorial videos: Visual Recognition in Primates and Machines and Sensory Coding and Hierarchical Representations


Some of the NIPS tutorials are out. Of interest are those of Tomaso Poggio, who with Thomas Serre has devised a feedforward model of the visual cortex, and of Michael Lewicki on Sensory Coding and Hierarchical Representations, especially when viewed in relation to this entry. In the first presentation I note an interesting paper that begins the work of defining a norm based on the hierarchical structure underlying the visual cortex model of Poggio and Serre. It is entitled Derived Distance: towards a mathematical theory of visual cortex by Steve Smale, Tomaso Poggio, Andrea Caponnetto and Jake Bouvrie. And as we have begun to learn, never underestimate it when a Fields medalist writes about things that we little people can understand. I need to come back to this later.

Visual Recognition in Primates and Machines by Tomaso Poggio. The slides are here. The abstract of the presentation reads:
    Understanding the processing of information in our cortex is a significant part of understanding how the brain works and of understanding intelligence itself, arguably one of the greatest problems in science today. In particular, our visual abilities are computationally amazing and we are still far from imitating them with computers. Thus, visual cortex may well be a good proxy for the rest of the cortex and indeed for intelligence itself. But despite enormous progress in the physiology and anatomy of the visual cortex, our understanding of the underlying computations remains fragmentary. I will briefly review the anatomy and the physiology of primate visual cortex and then describe a class of quantitative models of the ventral stream for object recognition, which, heavily constrained by physiology and biophysics, have been developed during the last two decades and which have been recently shown to be quite successful in explaining several physiological data across different visual areas. I will discuss their performance and architecture from the point of view of state-of-the-art computer vision system. Surprisingly, such models also mimic the level of human performance in difficult rapid image categorization tasks in which human vision is forced to operate in a feedforward mode. I will then focus on the key limitations of such hierarchical feedforward models for object recognition, discuss why they are incomplete models of vision and suggest possible alternatives focusing on the computational role of attention and its likely substrate – cortical backprojections. Finally, I will outline a program of research to attack the broad challenge of understanding in terms of brain circuits the process of image inference and in particular recognition tasks beyond simple scene classification.

  • Flash Movie Session A

  • Flash Movie Session B



    Sensory Coding and Hierarchical Representations by Michael Lewicki. The slides are here. The abstract description reads:

The sensory and perceptual capabilities of biological organisms are still well beyond what we have been able to emulate with machines, and the brain devotes far more neural resources to the problems of sensory coding and early perception than we give credit in our algorithms. What is it all doing? Although a great deal has been learned about anatomical structure and physiological properties, insights into the underlying information processing algorithms have been difficult to obtain. Recent work, however, has begun to elucidate some of the underlying computational principles and processes that biology uses to transform the raw sensory signal into a hierarchy of representations that subserve higher-level perceptual tasks. A central hypothesis in this work is that biological representations are optimal from the viewpoint of statistical information processing, and adapt to the statistics of the natural sensory environment. In this tutorial, I will review work on learning sensory codes that are optimal for the statistics of the natural sensory environment and show how these results provide theoretical explanations for a variety of physiological data in both the auditory and visual systems. This will include work that has extended these results to provide functional explanations for many non-linear aspects of early auditory and visual processing. I will focus on work on the auditory and visual systems but also emphasize the generality of these approaches and how they can be applied to any sensory domain. I will also discuss work that generalizes the basic theory and shows how neural representations optimally compensate for sensory distortion and noise in neural populations. Finally, I will review work that goes beyond sensory coding and investigates the computational problems involved in computing more abstract sensory properties and invariant features that can subserve higher-level tasks such as perceptual organization and analysis of complex, natural scenes.

  • Flash Movie Session A
  • Flash Movie Session B
Thursday, January 10, 2008

Human Cognition and Biological Regulation of the Neural Network: Advances in Fragile X Syndrome and Alzheimer's, and a Machine Learning Contest



    Wow.

Some people don't realize it, but as things stand we currently do not understand why people with Down syndrome have lower cognitive ability. Talk about something important: we can detect whether an embryo has this condition, but we don't know why those affected will, on average, have lower cognitive ability. Similarly, in Fragile X syndrome, which is linked to autism, we also don't know why people have lower cognitive abilities. This seems to be changing with some new findings by Kimberly Huber and her team:

    Dr. Huber previously co-discovered that mice genetically engineered to lack Fmr1 have a defective signaling system in the brain that controls learning in the hippocampus. This system relies on a chemical messenger called glutamate, which under normal circumstances causes nerve cells to make proteins and change their electrical firing patterns in response to learning situations. Without a properly working Fmr1 gene, the glutamate signaling system malfunctions. In 2007 she and colleagues at UT Southwestern found that acetylcholine, another specific signaling chemical, affects the same protein-making factory that glutamate does....

    “We suggest that treatment that affects the acetylcholine system might be a supplement or alternative to drugs targeting the glutamate pathway,” Dr. Huber said.

    In the current study, she and postdoctoral researcher Dr. Jennifer Ronesi investigated a protein, called Homer, which serves as a kind of structural support for the glutamate system. The Homer–glutamate support system is disconnected in Fragile X syndrome. Dr. Huber’s group discovered that this disconnection results in an inability of brain cells to make the new proteins important for learning and memory.

So while BERT and ERNI seem to be important in controlling brain development, Homer is central to enabling the learning process. Who knew? :-) In light of this finding, I hope that at some point we get a consistent story on why statins seem to be doing a great job at recovering cognitive abilities. On an unrelated note:

If you think you have a good machine learning scheme, you might want to try it out on the Neuron Modeling Challenge organized by EPFL. The deadline is the beginning of February and they give out cash rewards: something like 10,000 Swiss francs, or about 6,091 euros, and like a million dollars these days :-)

    Photo Credits: Wikipedia.

    Saturday, December 29, 2007

    Compressed Sensing: Random Features for Large-Scale Kernel Machines


In a previous entry, I mentioned the potential connection between compressed sensing and the visual cortex through a model that uses random projections. At the latest NIPS conference, it looks like we are beginning to see some convergence in Random Features as an Alternative to the Kernel Trick by Ali Rahimi and Benjamin Recht. The Matlab code is on the main webpage. The abstract reads:

To accelerate the training of kernel machines, we propose to map the input data to a randomized low-dimensional feature space and then apply existing fast linear methods. The features are designed so that the inner products of the transformed data are approximately equal to those in the feature space of a user specified shift invariant kernel. We explore two sets of random features, provide convergence bounds on their ability to approximate various radial basis kernels, and show that in large-scale classification and regression tasks linear machine learning algorithms applied to these features outperform state-of-the-art large-scale kernel machines.
Thanks Jort. So now the classification mechanism is actually simplified through the use of random projections. This is thought-provoking.
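To make the random-projection flavor of the paper concrete, here is a minimal sketch of the random Fourier feature construction of Rahimi and Recht (not their released code): inner products of the random features approximate a Gaussian kernel, so a plain linear method trained on these features behaves like a kernel machine. The parameter names are mine.

```python
import numpy as np

# Random Fourier features (Rahimi & Recht): build z(x) such that
# z(x) . z(y) ~ exp(-gamma * ||x - y||^2), so a *linear* method on z
# approximates a kernel machine with a Gaussian (RBF) kernel.
rng = np.random.default_rng(0)
d, D, gamma = 10, 500, 0.5                    # input dim, number of features, kernel width

W = rng.normal(scale=np.sqrt(2 * gamma), size=(D, d))  # frequencies drawn from the kernel's Fourier transform
b = rng.uniform(0, 2 * np.pi, size=D)                   # random phases

def features(X):
    """Map rows of X (n x d) to D random cosine features."""
    return np.sqrt(2.0 / D) * np.cos(X @ W.T + b)

# Sanity check: the feature inner product approximates the RBF kernel value.
x, y = rng.standard_normal(d), rng.standard_normal(d)
zx, zy = features(x[None, :])[0], features(y[None, :])[0]
exact = np.exp(-gamma * np.linalg.norm(x - y) ** 2)
print("random-feature estimate:", zx @ zy, "  exact kernel:", exact)
```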

    Thursday, November 29, 2007

    Outliers in the Long Tail: The case of Autism.

When reading subjects like this one, I am still amazed that some people can live off blogging and that other people think they can do that too, based purely on traffic-based ads. It is important to realize that sometimes what matters is quality traffic. This is often quantified in click-through traffic, but oftentimes the important feature of a blog is to bring awareness to a subject that is difficult to communicate about. So the whole issue of power laws on the net, as featured in the long tail argument of Chris Anderson, does not seem to take the quality aspect into account. In other words, while linking behavior and traffic do follow a power law (applications on Facebook do not), the most important question should be:

Is there a business case for using the web where the amount of traffic is not directly proportional to income or recognition?


Let me take an example featured in this blog. In some cases, people don't know about a subject or the wording associated with that subject, but they are very much willing to know more because it affects them. Case in point: autism (it could be any other subject). I have made several postings on the business case to be made for an eye tracking device to detect or quantify autism (Part I, Part II, Part III). When I look back at the log of queries leading to this blog (after these entries were posted), there is a fair amount of low-level queries with the following keywords:

    • " infant eye tracking",
    • "baby not eye tracking well",
    • "eye tracking autism",
    • "baby's eyes not tracking",
    • "newborn eye tracking",
    • "detecting autism early",
    • "eye tracking kids",
    • "babies with eye tracking problems",
    • "autism and follow eyes",
    • "autism and eye tracking importance",
    • "autism and eye movements"........
I would bet that a fair amount of these queries come from parents who think that something is amiss with their kids. Linking eye tracking behavior to autism has probably given them a head start with regard to getting a diagnosis. Even though it may not be a diagnosis of autism, we know from studies that eye tracking is important in the learning process of infants. In a more recent entry, I mentioned the difficulty of getting a diagnosis of autism using current testing techniques. That post has brought a steady stream of queries along the lines of the following keywords:

    • "adi-r and ados diagnosis",
    • "autism difficulty with standardized testing",
    • "ados failed to detect asd",
    • "age at which autism diagnosed",
    • "pdd-nos stability of diagnosis"
In both cases, the traffic is seldom large, but I am sure the entries have had some amount of influence on the readers. [On a related note, Aleks Jakulin and Masanao Yajima are in the process of writing another installment of the Venn Diagram blog entry that was started on this blog (the current update is here). Much interesting information should come out of their work. This is important work for many people: medical doctors, families and researchers. To medical doctors and families, it points to the weakness of the current testing system. To researchers, making sense of this data should allow them to pinpoint areas of the tests that need refinement and those areas that are too coarse.]

With one out of 166 kids being on the autism spectrum, it would seem to me there is ample work for people just trying to make sense of new findings (rather than delivering "solutions", which is the current market). Doctors or families, while directly affected, sometimes do not have the means to interpret the new findings and need somebody who is specialized in this area. I have had several hits from countries other than the U.S., and I would guess the "market" is much larger than what the traffic shows.

In all, only the web can provide access to this less-than-1-in-100 slice of the population, and I would venture that some percentage of that population (my guess is 10 percent) is willing to pay for a service they cannot get otherwise. Eventually, instead of a long tail or power law, we should really think of it in a different manner, not unlike that featured by Dan Bricklin




which reminds me of Plouffe's Inverter graphs that show how digits in numbers follow power laws (Benford's law):

    One can clearly see outliers in the tail.


P.S. The fact that I am currently not running ads on this blog is very much linked to the fact that the keyword "Autism" features many ads for products for which nobody can guarantee anything.

    Tuesday, October 16, 2007

    Judging the Autism Charts Challenge

When writing this entry on the difficulty of displaying data when diagnosing autism at age 2, I did not expect so many people to put their minds into making the graphs better. Andrew Gelman and Masanao Yajima made a challenge out of it. It got picked up by the folks at JunkCharts, EagerEyes.org, and Perceptual Edge, and many brain computing cycles went into this Challenge. Previously, I had already stated that I thought autism was a Grand Challenge, and this is in no small part because the diagnosis phase is not early enough, not accurate enough and sometimes not useful enough.
• Not early enough is the reason why this study [1] by Catherine Lord was performed. Instead of taking a "normal" test at age 3, age 2 is preferred with regard to intervention.
• Not accurate enough and not useful: there is currently very little difference whatsoever in treatment between kids that have been diagnosed autistic or with Pervasive Developmental Disorder – Not Otherwise Specified (PDD-NOS).
    Eventually, one cannot even devise clinical trials for drug development until some of these tests are better refined. The figures entered as part of the Challenge can be seen in full scale here:










    Please go to Andrew Gelman and Masanao Yajima's blog and answer the survey at the very end. Please note there are TWO questions.
    • Which plot did you like best?
    • Which plots did you find useful?
Please answer them both. Thank you to Antony Unwin, Stack Lee, Robert Kosara, Patrick Murphy, Andrew Gelman and Masanao Yajima for contributing to this effort.

[1] Autism From 2 to 9 Years of Age, Catherine Lord, Susan Risi, Pamela S. DiLavore, Cory Shulman, Audrey Thurm, Andrew Pickles. Arch Gen Psychiatry, Vol. 63, pp. 694-701, June 2006

Liked this entry? Subscribe to the Nuit Blanche feed, there's more where that came from.

    Tuesday, September 25, 2007

On the difficulty of Autism diagnosis: Can we plot this better?

    [ New Update: see here on the Challenge, results and survey]

[Update: I have put some updates in the text of this entry to reflect my better understanding of the graphs, which doesn't change the graphs themselves]

    When I met Catherine Lord a year ago, I was struck by this graph in her presentation (excerpted from "Autism From 2 to 9 Years of Age" [1]):

How can one understand it? Each circle denotes a set of kids that have gone through one test designed to figure out whether or not they were affected by autism (PL-ADOS indicates Pre-Linguistic Autism Diagnostic Observation Schedule; ADI-R, Autism Diagnostic Interview–Revised; and Clinician indicates that a clinician made an assessment based on an interview with the kid). Intersections between circles point to populations of kids that have gone through several tests [Update: and tested positive on those tests]. The intersection of all three circles indicates kids that have gone through the three tests (Clinician, ADI-R, PL-ADOS) [and tested positive on all three. A kid could conceivably test positive on the three tests and not appear in either graph A or B]. The number in each circle indicates the number of kids at age 2 that have been deemed autistic (left circles) or in the autistic spectrum (right circles) by the test associated with that circle (Clinician, ADI-R, PL-ADOS). The percentage in parentheses indicates the proportion of those kids who kept the same diagnosis at age 9. Let's take an example: in the first three circles on the left, there is a label indicating 16 (56%). This can be translated into: [Update: there were 16 kids that were diagnosed with autism at age 2 AND] were also diagnosed with autism using the ADI-R test, and 56% of them remained in that diagnosis at age 9. [Update: One more thing: the categories Autism [A] and ASD [B] denote kids that have been deemed autistic or with ASD through a Best Estimate method at age 2; this method uses several means to come to that conclusion. The percentages in parentheses denote whether these kids are still autistic or with ASD seven years later using a Best Estimate method at age 9]. Some of these numbers are simply stunning because they show our current inability to do a good job of determining what reliably constitutes autism at age 2. This is all the more important because an earlier diagnosis is really needed to change the outcome of this condition. Catherine and her co-workers eventually comment in the paper:

    Diagnosis of autism in 2-year-olds was quite stable up through 9 years of age, with the majority of change associated with increasing certainty of classifications moving from ASD/PDD-NOS to autism. Only 1 of 84 children with best-estimate diagnoses of autism at age 2 years received a nonspectrum diagnosis at age 9 years, and more than half of children initially diagnosed with PDD-NOS later met autism criteria. Nevertheless, more than 10% of children with diagnoses of PDD-NOS at age 2 years received nonspectrum best-estimate diagnoses (ie, not autism or ASD) by age 9 years, and nearly 30% continued to receive diagnoses of PDD-NOS,indicating mild symptoms at age 9 years. A significant minority of children with milder difficulties within ASD at age 2 years showed only mild deficits in the clinical ASD range at age 9 years. Classifications changed substantially more often from ages 2 to 5 years than from ages 5 to 9 years. The bulk of change in diagnosis occurring in early years is consistent with another recent study. At age 2 years, diagnostic groups were more similar in functioning and IQ than the diagnostic groups identified at age 9 years, when the autistic group showed very poor adaptive functioning and the PDD-NOS group, much less abnormal verbal and nonverbal IQ. Among this specialized group of clinicians, clinical judgment of autism at age 2 years was a better predictor of later diagnosis than either standardized interview or observation. Contemporaneous agreement between clinical judgment and best-estimate judgment for 2-year olds was equal to that found between experienced raters in the DSM-IV field trials for older children and adults. Though the clinical diagnoses at age 2 years were made without knowledge of the ADI-R and ADOS algorithm scores, each clinician had administered either the PL-ADOS or the ADI-R and had the opportunity to discuss his or her impressions with the experienced clinician who had administered the other instrument. Thus, the information available to them was very different from the information obtained during a typical single office visit to a clinical psychologist or developmental pediatrician. The use of standardized measures seems likely to have improved the stability of diagnosis both directly through straightforward use of algorithms for autism and ASD and also indirectly through structuring clinical judgment. Of cases in which the classifications yielded by both instruments were not supported by the clinicians at age 2 years, 40% were children with severe mental retardation (and not autism) or children with very difficult behavior (and not autism), while the remainder were mild cases of autism characterized as uncertain. On the other hand, clinical judgments were consistently underinclusive at age 2 years, both for narrow diagnoses of autism and for broader classifications of ASD at age 9 years. Thus, scores from standardized instruments also made real contributions beyond their influence on informing and structuring clinical judgment. Overall, while standardized research instruments at age 2 years did not fully capture the insight in the form of certainty ratings made by experienced, well-trained clinicians, this insight was not by itself sufficient.

My main problem with this graph is that it does not make sense right away. Even though I have been thinking about it for a while, I still cannot come up with a better way of displaying this data. This assessment over time is unique in the annals of autism studies and hence a major milestone. I wish it were better designed to convey some of its underlying statistics, or lack thereof. Examples shown by Andrew Gelman on his blog or by Edward Tufte may be a good starting point. In particular, I am wondering how one could adapt the graphic on cancer survival rates redesigned by Tufte to this study. Initially, the Lancet study on cancer survival rates showed this graph:


Tufte redesigned it into a stunningly comprehensible table:


Can we do a better plot of the Autism study?
[Update 1: Andrew has posted the beginning of an answer here]



[1] Autism From 2 to 9 Years of Age, Catherine Lord, Susan Risi, Pamela S. DiLavore, Cory Shulman, Audrey Thurm, Andrew Pickles. Arch Gen Psychiatry, Vol. 63, pp. 694-701, June 2006

    Monday, September 24, 2007

    Guiding Intent


No MRI for you: It looks as though one of the primary reasons for not using MRI to detect behavior (and make money off of it) is not that the science of dimensionality reduction from brain activity is barely understood. More likely, it is that some type of regulation will forbid the use of MRI altogether. This is a stunning development, as there is no scientific ground on which these regulations stand. I am only half joking on this topic, as I cannot understand how an entity like the EU can pass a law that in effect will kill people and prevent them from doing Compressed Sensing.

This news, and the fact that not everybody has access to a full-scale fMRI system, is all the more reason to consider reverting to more passive means of detecting intention. Previously, I mentioned the issue of eye tracking to detect autism (a business case: Part I, Part II, Part III). The issue of detecting autism early is all the more important for families that already have a case of autism. They want to know very early whether this same condition affects the new siblings. The idea is that through some very early detection and appropriate therapy, the brain may train itself to work better very early, resulting in a tremendous difference in the final diagnosis. As it turns out, these posts were also hinting that a gaze-following deficiency was a likely culprit for language deficiencies. But there seems to be an additional reason why eye tracking could attract an even larger crowd (not just people affected by autism): scientific inference making, or guiding intention.

Michael J. Spivey and Elizabeth Grant [1] did a study in 2003 suggesting a relationship between eye movements and problem solving, by showing that certain patterns of eye movement appeared as participants got closer to solving the problem. More recently, Laura E. Thomas and Alejandro Lleras tried to evaluate this further in this paper [2]:
    In a recent study, Grant and Spivey (2003) proposed that eye movement trajectories can implicitly impact cognition. In an "insight" problem-solving task, participants whose gaze moved in trajectories reflecting the spatial constraints of the problem's solution were more likely to solve the problem. The authors proposed that perceptual manipulations to the problem diagram that influence eye movement trajectories during inspection would indirectly impact the likelihood of successful problem solving by way of this implicit eye-movement-to-cognition link. However, when testing this claim, Grant and Spivey failed to record eye movements and simply assumed that their perceptual manipulations successfully produced eye movement trajectories compatible with the problem's solution. Our goal was to directly test their claim by asking participants to perform an insight problem-solving task under free-viewing conditions while occasionally guiding their eye movements (via an unrelated tracking task) in either a pattern suggesting the problem's solution (related group) or in patterns that were unrelated to the solution (unrelated group). Eye movements were recorded throughout the experiment. Although participants reported that they were not aware of any relationship between the tracking task and the insight problem, the rate of successful problem solving was higher in the related than in the unrelated group, in spite of there being no scanning differences between groups during the free-viewing intervals. This experiment provides strong support for Grant and Spivey's claim that in spatial tasks, cognition can be "guided" by the patterns in which we move our eyes around the scene.
In the paper, they eventually claim:

    We believe that eye movement trajectories can serve as implicit “thought” guides in spatial reasoning tasks...Although additional studies are necessary to determine how powerful this link between eye movements and cognition is, it is now clear that not only do eye movements reflect what we are thinking, they can also influence how we think.

This is fascinating, and I wonder whether the use of serious games for therapy or cognition improvement might be a good start.



[1] Eye Movements and Problem Solving: Guiding Attention Guides Thought, Elizabeth R. Grant and Michael J. Spivey
    [2] Moving eyes and moving thought: The spatial compatibility between eye movements and cognition, Laura E. Thomas, Alejandro Lleras (or here)
    [3] Image: Evil gives, Gapingvoid.com, Hugh Macleod

    Thursday, August 23, 2007

    We yawn because we care


Hayabusa may be coming home thanks to the hard work of Japanese engineers at JAXA. There is now a layer for gigapixel images in Google Earth. This is going to come in handy when we retrieve our camera (Geo-CAM R) from HASP (a high-altitude balloon to be flown in September) and make large maps like we did last year.
In other news, it also looks like we yawn because we care:
    Current results suggest that contagious yawning is impaired in ASD, which may relate to their impairment in empathy. It supports the claim that contagious yawning is based on the capacity for empathy.

    Tuesday, August 14, 2007

    Looking at beautiful things

[Embedded video: http://www.youtube.com/v/0mCGuPeXmhw]
Damaris has a small entry on her current work looking at the Shuttle's damaged tiles. We had a similar project at STC where we would look at the belly of the Shuttle using an HD camera at 60 frames per second and eventually provide 3D photogrammetry of the belly of the Orbiter from several hundred meters below.




One of the theses on the subject was written by Paul Gersting at Texas A&M University under Johnny Hurtado and Mark Lemmon and was entitled: A photogrammetric on-orbit inspection for orbiter thermal protection system. When the Shuttle would perform the RPM (Rendezvous Pitch Maneuver, or shuttle back flip) below the International Space Station, our HEOCam system would have been able to evaluate impact depth as Paul shows in his work.


    The IPAM/UCLA Graduate Summer School on Probabilistic Models of Cognition: The Mathematics of Mind has finished. The presentations and webcasts can be found here.

    Saturday, June 30, 2007

    Of Mice and Floating Men: Dimensionality Reduction of Biological Processes


Joe Z. Tsien, in the recent Scientific American (free pdf here until today), talks about how he devised different techniques with mice in order to understand what a memory is. In the course of the analysis he used Multiple Discriminant Analysis (MDA) to analyze how cell activation locations were related to specific memories of different events (such as falling or an earthquake). The video of the MDA in action is here. This is fascinating on two counts. First, a machine learning technique producing dimensionality reduction is used to isolate specific elements of an actual learning process. In the detailed version of the article, Tsien remarks that the initial feature space is about 260-dimensional and the subspace of interest is 3-dimensional:

    After collecting the data, we first attempted to tease out any patterns that might encode memories of these startling events. Remus Osan--another postdoctoral fellow--and I analyzed the recordings using powerful pattern-recognition methods, especially multiple discriminant analysis, or MDA. This mathematical method collapses what would otherwise be a problem with a large number of dimensions (for instance, the activities of 260 neurons before and after an event, which would make 520 dimensions) into a graphical space with only three dimensions. Sadly, for classically trained biologists the axes no longer correspond to any tangible measure of neuronal activity but they do map out a mathematical subspace capable of discriminating distinct patterns generated by different events.

One cannot help but think that this tool (MDA) is in fact mimicking the biological process. Tsien then goes on to talk about the hierarchical structure of the memory process, not unlike the primary visual cortex model from MIT, whereby the visual process reduces the feature space from several hundred dimensions down to very few.
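For readers curious about what such a projection looks like in practice, here is a minimal sketch on synthetic data (not Tsien's recordings), using scikit-learn's LinearDiscriminantAnalysis as a stand-in for the MDA step: hypothetical 520-dimensional "population activity" from four event types is collapsed onto the three discriminant axes.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Synthetic stand-in for the mouse data: "activity" of 260 neurons before and
# after an event (520 dimensions), recorded over many trials of 4 event types.
# LDA/MDA collapses the 520 dimensions onto at most (n_classes - 1) = 3 axes.
rng = np.random.default_rng(0)
n_neurons, n_trials = 260, 50
labels = np.repeat(np.arange(4), n_trials)                     # 4 event types

# each event type shifts the population activity along its own random direction
means = rng.standard_normal((4, 2 * n_neurons))
X = means[labels] + 3.0 * rng.standard_normal((4 * n_trials, 2 * n_neurons))

lda = LinearDiscriminantAnalysis(n_components=3)
Z = lda.fit_transform(X, labels)                               # trials projected into 3-D
print(Z.shape)                                                 # (200, 3)
```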

The second fascinating aspect of this experiment with mice is the similarity with an experiment I went through multiple times: flying in zero gravity. Besides the floating experience, I have always noted the inability of most flyers to clearly remember their experiences. This is so bad that in any of the free-flying planes used by either NASA or ESA, there are large counters of parabolas informing people how many zero-g periods remain for them to experience over the course of two hours.
One could think that this is due to the fact that people are not accustomed to the conditions and that it is somehow traumatic. It so happens that even after 50 flights, I still have the hardest time remembering precisely what happened during these flights. My point is: is the paper by Tsien really locating a real memory, or something else?

    Friday, May 11, 2007

HCI as a way to quantify Autism in infants? Part I.

HCI stands for Human-Computer Interaction and is the computer science field devoted to enabling a better connection between computers and humans. David Knossow at INRIA just released his PhD thesis to the TEL server. It is in French, but his thesis summary reads:

Markerless Human Motion Capture with Multiple Cameras. My Ph.D. manuscript deals with the problem of markerless human motion capture. We propose an approach that relies on the use of multiple cameras and that avoids most of the constraints on the environment and the use of markers to perform the motion capture, as is generally the case for industrial systems. The absence of markers makes it harder to extract relevant information from images and also to correlate this information between cameras. Moreover, interpreting this extracted information in terms of joint parameter motion is not an easy task. We propose an approach that relies on occluding contours of the human body. We studied the link between motion parameters and the apparent motion of the edges in the images. Minimizing the error between the extracted edges and the projection of the 3D model onto the images allows us to estimate the motion parameters of the actor. Among the open issues, we show that using video-based motion capture allows us to provide additional hints such as contacts between body parts or between the actor and his environment. This information is particularly relevant for improving character animation.


While the initial interest is in capturing human motion at low cost (as opposed to current systems, which cost up to 400,000ドル), I believe this is the beginning of technology development that is central to the study and detection of autism in infants (3-6 months old). The current state of affairs with regard to autism detection is that one waits until speech is shown to be very late (at about 2 years old) to begin diagnosing the condition, even though brain growth has been shown by Eric Courchesne to be abnormal from 0 to 2. With the advent of the digital world, home movies are beginning to be records of the state of knowledge on the condition of people. Some studies have shown at the same time that home movies could be used to figure out very early that something is not right (search PubMed with the keywords: Autism movies). For instance, in

    "Early recognition of 1-year-old infants with autism spectrum disorder versus mental retardation" by Osterling JA,Dawson G,Munson JA (Dev Psychopathol. 2002 Spring;14(2):239-51.), one can read the following:

    Results indicated that 1-year-olds with autism spectrum disorder can be distinguished from 1-year-olds with typical development and those with mental retardation. The infants with autism spectrum disorder looked at others and oriented to their names less frequently than infants with mental retardation. The infants with autism spectrum disorder and those with mental retardation used gestures and looked to objects held by others less frequently and engaged in repetitive motor actions more frequently than typically developing infants.

    DARPA Urban Challenge: Obstacle avoidance - Isolating important cues

I have had requests from several parties to give an overview of the type of obstacle avoidance scheme that might be most promising. Right now, we (Pegasus) are still evaluating some of these, so this entry should not be construed as part of our algorithm unveiling entries, but rather as a general overview we did a while back. It is important to realize that the main contribution of this entry is really about defining a hardware + software solution for the localization of cues that will later be learned/categorized. One large component of an obstacle avoidance solution is the machine learning/statistical device used to rapidly identify these cues as problematic for the autonomous vehicle or not. This is not unlike the human cortex (see the reference section [1] [2] [3]).

In the case of the Urban Challenge, we are facing not only stopped obstacles but moving ones as well. The moving obstacles have behaviors one needs to learn from as well. In other words, in the case of vision, a lot of the work boils down to producing a small number of cues/features from a very large set of data (the pixels of an image). In some areas of computer science this is called dimensionality reduction.

    1. Stereo-imaging:
1. The fly algorithm, a robust stereo algorithm using a real-time genetic algorithm (yes, there is such a thing as a real-time genetic algorithm!), has been tried on cars and, specifically, has been used to avoid people and other objects. The initial thesis with the algorithm is in French. Improvements over the thesis have focused on the car driving experience.
2. There are also numerous commercial solutions, as listed by the folks at V2_lab's, where they discuss each of them. I found this entry pretty revealing about the state of affairs with regard to stereovision; you have to look at the comment section:
        For most stereo matching algorithms the Firewire cameras produce higher quality uncompressed images that do not wreak havoc on sensitive feature detectors. I use the Unibrain Fire-I board camera http://www.unibrain.com/index.html with the CMU 1394 Digital Camera API (Windows), which gives you very complete control and works with just about any Firewire camera, because they all use the same standard interface. http://www.cs.cmu.edu/~iwan/1394/ . When I read the technical reports for the 2005 DARPA Grand Challenge almost every report showed pictures of vehicles equiped with stereo pairs of cameras, but at the race just about all of them had been removed, presumably because of basic issues such as camera synchronization.


    2. Monocular evaluation of the distance field. There are two approaches that caught our attention:
      1. Vision-based Motion Planning for an Autonomous Motorcycle on Ill-Structured Road, Dezhen Song, Hyun Nam Lee, Jingang Yi and Anthony Levandowski from the bike entry at the last Grand Challenge, and
      2. Depth Estimation Using Monocular and Stereo Cues / 2197, Ashutosh Saxena, Jamie Schulte, Andrew Y.Ng and Learning Depth from Single Monocular Images, Ashutosh Saxena, Sung Chung, and Andrew Y. Ng. In NIPS 18, 2006. [ps, pdf]
3. (With regard to monocular information, we should not forget the excellent MonoSLAM: this website has a Matlab implementation of SLAM using monocular vision. There is a timely thesis on the subject here, which looks at using two cameras, each implementing the MonoSLAM algorithm.)
3. A Random Lens Imager: a hardware implementation of a totally new concept in data processing known as compressed sensing (don't ask anybody around you about it, because it is too new). It needs only one camera, but much of the work goes into the calibration.

    References:

    [1] T. Serre, L. Wolf, S. Bileschi, M. Riesenhuber and T. Poggio. Object recognition with cortex-like mechanisms. In: IEEE Transactions on Pattern Analysis and Machine Intelligence, 29 (3), pp. 411-426 , 2007

    [2] T. Serre. Learning a dictionary of shape-components in visual cortex: Comparison with neurons, humans and machines, PhD Thesis, CBCL Paper #260/MIT-CSAIL-TR #2006-028, Massachusetts Institute of Technology, Cambridge, MA, April, 2006
(Pages 154-163 contain the model parameters.)

[3] T. Serre, L. Wolf and T. Poggio. Object recognition with features inspired by visual cortex. In: Proceedings of 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2005), IEEE Computer Society Press, San Diego, June 2005

    Extended paper: [CBCL Paper #243/AI Memo #2004-026] and the code is here.

    Tuesday, April 17, 2007

Compressed Sensing in the Primary Visual Cortex?


    [Update, please note that the Compressed Sensing Blog is at this address: http://nuit-blanche.blogspot.com/search/label/CS ]

Thomas Serre, Aude Oliva and Tomaso Poggio just came out with a paper showing that the brain processes information in a feedforward fashion, i.e., in most of the brain architecture there is no feedback loop. It's a breakthrough: even though the biology seemed to show that, there was little computational modeling that could support this hypothesis. Most of the modeling is featured in Thomas Serre's Ph.D. thesis entitled:

    Learning a dictionary of shape-components in visual cortex: Comparison with neurons, humans and machines, PhD Thesis, CBCL Paper #260/MIT-CSAIL-TR #2006-028, Massachusetts Institute of Technology, Cambridge, MA, April, 2006

I hinted at this earlier, but compressed sensing seems to be such a robust technique that there is little reason to believe it is not part of a biological process at work in the brain. Then I found the following statement on page 3 of the preprint of the PNAS paper (thanks to Thomas Serre, I found it in the final paper; it is in the footnote section here):

    Functional organization:

    Layers in the model are organized in feature maps which may be thought of as columns or clusters of units with the same selectivity (or preferred stimulus) but with receptive fields at slightly different scales and positions (see Fig. S 1). Within one feature map all units share the same selectivity, i.e., synaptic weight vector w which is learned from natural images (see subsection A.1.2).

There are several parameters governing the organization of individual layers: K_X is the number of feature maps in layer X. Units in layer X receive their inputs from a topologically related N_X × N_X × S_X grid of possible afferent units from the previous layer, where N_X defines a range of positions and S_X a range of scales.
Simple units pool over afferent units at the same scale, i.e., S_Sk contains only a single scale element. Also note that in the current model implementation, while complex units pool over all possible afferents, such that each unit in layer Ck receives n_Ck = N_Ck × N_Ck × S_Ck, simple units receive only a subset of the possible afferent units (selected at random), such that n_Sk < N_Sk × N_Sk (see Table S1 for parameter values).

Finally, there is a downsampling stage from the Sk to the Ck stage. While S units are computed at all possible locations, C units are only computed every Ck possible locations. Note that there is a high degree of overlap between units in all stages (to guarantee good invariance to translation). The number of feature maps is conserved from the Sk to the Ck stage, i.e., K_Sk = K_Ck. The value of all parameters is summarized in Table S1.



So it looks like, in this layered approach to stimulus understanding, the current modeling allows for randomly picking some afferents out of several in order to go to a higher level of synthesis. This approach is very similar to compressed sensing and some of the concepts developed in the Uniform Uncertainty Principle (since we know that natural images can, for the most part, be sparse in the Fourier domain) by Terry Tao, Emmanuel Candès and Justin Romberg. Two features of this model can be mapped to the compressed sensing approach: a feedback mechanism corresponds to the usual transform coding approach (compute all the wavelet coefficients and keep only the largest ones), whereas compressed sensing avoids that nonlinear feedback process. Random sampling is currently the best approach to provide a uniform sampling strategy irrespective of most known bases (sines, cosines, wavelets, curvelets, ...).
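A small numerical illustration of the distinction drawn above, with an illustrative DCT playing the role of the wavelet/curvelet basis: the transform-coding route computes every coefficient and then keeps the largest ones (an adaptive, nonlinear step), while the compressed sensing route records a fixed number of non-adaptive random measurements and leaves the sparsity-seeking work to the reconstruction.

```python
import numpy as np
from scipy.fft import dct, idct

# The "feedback" route (transform coding): compute *all* transform coefficients,
# then keep only the k largest -- an adaptive, nonlinear step. The compressed
# sensing route takes m non-adaptive random measurements instead and never
# forms the full coefficient vector.
rng = np.random.default_rng(0)
n, k, m = 256, 10, 60

coeffs_true = np.zeros(n)
coeffs_true[rng.choice(n, k, replace=False)] = rng.standard_normal(k)
signal = idct(coeffs_true, norm="ortho")        # signal exactly k-sparse in the DCT domain

coeffs = dct(signal, norm="ortho")              # adaptive route: all n coefficients...
keep = np.argsort(np.abs(coeffs))[-k:]          # ...then keep the k largest
approx = np.zeros(n)
approx[keep] = coeffs[keep]
x_k = idct(approx, norm="ortho")                # best k-term (nonlinear) approximation
print("k-term relative error:", np.linalg.norm(x_k - signal) / np.linalg.norm(signal))

# Non-adaptive route: y = Phi @ signal with a random Phi; an l1 solver (as
# sketched earlier on this page) recovers the same k-sparse coefficients from
# these m numbers without ever computing `coeffs`.
Phi = rng.standard_normal((m, n)) / np.sqrt(m)
y = Phi @ signal
print("non-adaptive measurements kept:", m, "out of", n)
```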

    Sunday, April 08, 2007

    DARPA Urban Challenge: Unveiling our algorithm

In a previous entry, I mentioned the fact that we are unveiling our strategy for our entry in the DARPA Urban Challenge (the DARPA Urban Challenge is about driving at a record pace in an urban environment past many urban difficulties, including no GPS capability). This was not accurate; in fact, we really are going to unveil our algorithm. You'll be privy to the development quirks and everything that goes into implementing an algorithm that has to respond on-line to a challenging situation. I'll be talking about the history of why we are choosing specific algorithms over others. I will specifically talk more about manifold-based models for decision making in the race and the use of techniques devised to produce a storage device of previous actions in order to provide some sort of supervised learning capability. In the previous race, we were eliminated early mostly because we were plagued by mechanical problems that most of us had never faced before (none of us has a robotics background); we hope to go farther this time as the vehicle is OK. For reference, we have already shown some of our drive-by-wire program before as well. We made some of our data available before, and I expect, as time permits, to do the same as we go along. Because our entry is truly innovative, we are trying to balance not getting eliminated, by passing every step of the application, against the innovations in the algorithm. However, since none of us is interested in just an autonomous car, our emphasis will always be on the side of doing something that most other entries are not attempting, such as using compressed sensing and robust mathematical techniques.

    Thursday, March 22, 2007

    Compressed Sensing, Primary Visual Cortex, Dimensionality Reduction, Manifolds and Autism

    In a previous entry, I mentioned the potential connection between compressed sensing, the primary cortex and cognition deficit diseases like autism without much explanation. Here is an attempt at filling the holes.

When David Field and Bruno Olshausen showed that the primary cortex receives inputs from our eyes as a set of sparse functions that look like ridgelets and curvelets, it became obvious that one result was missing: if natural images are sparse and our eye system has sparse receptors, is there a way our brain finds a sparse decomposition of the world that works in a linear fashion? The thinking goes that our brain is capable of understanding scenes without an iterative process (an iterative process is nonlinear and has a high cost in terms of energy). When Emmanuel Candès and David Donoho showed that, in fact, non-adaptive schemes using curvelets could decompose natural images, it became obvious that a good parallel could be made between the physiology of the primary cortex and this new type of decomposition. But how do you do this decomposition? While an m-term curvelet expansion of a scene can be thresholded and can rival complex adaptive approximation schemes, it does not answer how the primary cortex eventually comes up with that number m.

The state of the art in our thinking about the primary cortex can be found here, in this review by Graham and Field on sparse coding in the neocortex. It specifically addresses the bounds imposed on the primary cortex by metabolic constraints:

We conclude our discussion by returning to the issue of metabolic constraints. Could we argue that the primary evolutionary pressure driving towards sparse coding is one related to the metabolic costs of neural firing? As noted earlier, both Attwell and Laughlin (2001) and Lennie (2003) argue that there are not enough resources to achieve anything but a low-activity system. Moreover, when we find sparse activity in frontal cortex (Abeles et al., 1990), it is more difficult to argue that the sparse activity must arise because it is mapping the sparse structure of the world. Even at early levels, if sparseness were metabolically desirable, there are a number of ways of achieving sparseness without matching the structure of the world. Any one of a wide variety of positively accelerating nonlinearities would do. Simply giving the neurons a very high threshold would achieve a sparse code, but the system would lose information. We argue that the form of sparse coding found in sensory systems is useful because such codes maintain the information in the environment, but do so more efficiently. We argue that the evolutionary pressure to move the system towards a sparse code comes from the representational power of sparse codes. However, we do accept that metabolic constraints are quite important. It has been demonstrated that at the earliest levels of the visual system, ganglion cells (Berry, Warland and Meister, 1997) and lateral geniculate nucleus cells (Reinagel and Reid, 2000) show sparse (non-Gaussian) responses to temporal noise. A linear code, no matter how efficiently it was designed, would not show such sparse activity, so we must assume that the sparseness is at least in part due to the nonlinearities in the system and not due to the match between the receptive fields and the sparse structure of the input. As with the results showing sparse responses in non-sensory areas, we must accept that metabolic constraints may also be playing a significant role.

Well, it's nice to acknowledge that we have physical limitations, but to assume that a linear code cannot exist simply because we currently do not have a model for it is probably assuming too much. So what do we know? The primary cortex is a low-energy system, which basically removes from consideration any complex system that requires resources (like an iterative system). This situation favors a linear system, but so far we have not found a good model for that. There is something deeper still. Even if we knew much about the sparsity of a scene, understanding the brain is really about understanding how a large amount of information is reduced when traveling from the eye into the brain. In other words, we need to reduce the amount of data that our megapixel sensor called the eye is bringing in, and we must do this very fast (30 times a second). To put things in perspective, let us take an example: let us imagine we are seeing a scene where somebody waves a hand. If we were to take a video of this scene, we would probably get a 40 MB AVI file (uncompressed). That file could then be compressed to 1 MB using MPEG, for instance. While the compression is impressive, it is not impressive enough. In effect, when our brain sees this video, it can remember the hand and how it moved. The movement is probably two-dimensional, and so the brain really remembers the two parameters needed to produce a hand that waves in the manner shown in the video. In other words, the brain is probably not storing 1 MB of information when it stores this hand-waving activity; it is most probably storing how two parameters changed over time, which in many cases is much less than 1 MB of data. The real question becomes: is there a way to reduce that 1 MB of information further? We are not asking whether the input or the receptors are sparse (that is a necessary condition); we are interested in how much sparser we can make this information by using the connections between these sparse elements. Can we reduce the dimensionality of the signal further and exploit it?

Enter dimensionality reduction: ever since the discovery of dimensionality reduction schemes (LLE, Isomap, Laplacian/diffusion maps, ...) that take high-dimensional data and are able to map it onto low-dimensional manifolds, researchers have been trying to extend these techniques to wider sets of problems. In the cognition world, for instance, Jonathan Pillow and Eero Simoncelli perform dimensionality reduction applied to neural models, but it is not obvious how these techniques translate directly into a specific functionality of the primary cortex, even if they take the example of a V1 cell. It is also not obvious how some of these techniques are robust to noise. But as stated earlier, there are different ways to go about dimensionality reduction. One of the most intriguing, which has robustness built in, is compressed sensing. Compressed sensing has the ability to produce a robust decomposition of a manifold. Mike Wakin looked into this during his dissertation and found that smooth manifolds can readily be compressed using compressed sensing, thereby making it a very simple solution to dimensionality reduction (see R. G. Baraniuk and M. B. Wakin, Random Projections of Smooth Manifolds), but as Donoho and Grimes had shown earlier, sharp objects such as arms and legs have edges, and that makes the manifold non-differentiable. This is a problem because it means that one cannot easily extract parameters from a video if we have these sharp edges. In order to take care of that problem, one can be inspired by biology, i.e., smooth these images using Gabor wavelets as in the human visual system (Object Recognition with Features Inspired by Visual Cortex by Thomas Serre, Lior Wolf and Tomaso Poggio) and then use the random projection of smooth manifolds to eventually figure out the parameters of the movements (for more information on how to do that, see High-Resolution Navigation on Non-Differentiable Image Manifolds or The Multiscale Structure of Non-Differentiable Image Manifolds).
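To see why random projections play nicely with such image manifolds, here is a minimal sketch under simple, made-up assumptions: a hypothetical two-parameter family of smooth images (a translating Gaussian bump, so no sharp edges) is projected by a random Gaussian matrix, and pairwise distances along the manifold come out roughly preserved, in the spirit of the Baraniuk-Wakin result.

```python
import numpy as np

# A 2-parameter image manifold: a Gaussian bump translated across a 32x32
# image (smooth, because the bump has no sharp edges). Random projection to
# m << 1024 dimensions approximately preserves distances between manifold points.
rng = np.random.default_rng(0)
side, n, m = 32, 32 * 32, 60
grid = np.arange(side)
X, Y = np.meshgrid(grid, grid)

def bump(cx, cy, width=3.0):
    img = np.exp(-((X - cx) ** 2 + (Y - cy) ** 2) / (2 * width ** 2))
    return img.ravel()

centers = [(cx, cy) for cx in np.linspace(8, 24, 8) for cy in np.linspace(8, 24, 8)]
images = np.array([bump(cx, cy) for cx, cy in centers])        # 64 points on the manifold

Phi = rng.standard_normal((m, n)) / np.sqrt(m)                 # random projection matrix
projected = images @ Phi.T

def pairwise_distances(A):
    diff = A[:, None, :] - A[None, :, :]
    return np.sqrt((diff ** 2).sum(-1))

d_full = pairwise_distances(images)
d_proj = pairwise_distances(projected)
mask = ~np.eye(len(images), dtype=bool)
ratios = d_proj[mask] / d_full[mask]
print("distance ratio range after projection:", ratios.min(), ratios.max())   # close to 1
```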

[In Object Recognition with Features Inspired by Visual Cortex by Thomas Serre, Lior Wolf and Tomaso Poggio, one can only be struck by the pains the algorithm takes in order to be robust (a minimal sketch of the first two stages appears right after this bracketed note):

• S1: Apply a battery of Gabor filters to the input image. The filters come in 4 orientations θ and 16 scales s (see Table 1). Obtain 16 × 4 = 64 maps (S1)_sθ that are arranged in 8 bands (e.g., band 1 contains filter outputs of size 7 and 9, in all four orientations; band 2 contains filter outputs of size 11 and 13, etc.).
• C1: For each band, take the max over scales and positions: each band member is sub-sampled by taking the max over a grid with cells of size N_Σ first and the max between the two scale members second, e.g., for band 1, a spatial max is taken over an 8 × 8 grid first and then across the two scales (size 7 and 9). Note that we do not take a max over different orientations; hence, each band (C1)_Σ contains 4 maps.
• During training only: Extract K patches P_i, i = 1, ..., K, of various sizes n_i × n_i and all four orientations (thus containing n_i × n_i × 4 elements) at random from the (C1)_Σ maps from all training images.
• S2: For each C1 image (C1)_Σ, compute Y = exp(−γ‖X − P_i‖²) for all image patches X (at all positions) and each patch P_i learned during training, for each band independently. Obtain S2 maps (S2)_Σi.
• C2: Compute the max over all positions and scales for each S2 map type (S2)_i (i.e., corresponding to a particular patch P_i) and obtain shift- and scale-invariant C2 features (C2)_i, for i = 1, ..., K.


    ]
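Here is a minimal sketch of only the first two stages listed above (S1 Gabor filtering and C1 local max pooling), with illustrative filter sizes and pooling grids rather than the values of Serre et al.'s Table 1; the S2/C2 stages and the training step are omitted.

```python
import numpy as np
from scipy.signal import convolve2d

# Minimal S1/C1 stages in the spirit of Serre, Wolf & Poggio: S1 applies Gabor
# filters at several orientations, C1 takes a local max over position (the full
# model also pools over neighboring scales). Sizes here are illustrative only.
def gabor(size, theta, wavelength=4.0, sigma=2.0):
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    g = np.exp(-(x ** 2 + y ** 2) / (2 * sigma ** 2)) * np.cos(2 * np.pi * xr / wavelength)
    return g - g.mean()                                   # zero-mean Gabor patch

def s1_c1(image, orientations=4, filter_size=7, pool=8):
    c1_maps = []
    for i in range(orientations):
        theta = i * np.pi / orientations
        s1 = np.abs(convolve2d(image, gabor(filter_size, theta), mode="same"))   # S1 responses
        h, w = (s1.shape[0] // pool) * pool, (s1.shape[1] // pool) * pool
        blocks = s1[:h, :w].reshape(h // pool, pool, w // pool, pool)
        c1_maps.append(blocks.max(axis=(1, 3)))                                  # C1: local max pooling
    return np.stack(c1_maps)                               # (orientations, h/pool, w/pool)

image = np.random.default_rng(0).random((64, 64))
print(s1_c1(image).shape)                                  # (4, 8, 8)
```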
Besides Wakin, Donoho, Baraniuk and others collaborating with them, few have made that connection. Yves Meyer makes a passing reference to the use of compressed sensing in physiology (in Perception et compression des images fixes). Gabriel Peyré, however, is a little more specific here:
    Analogies in physiology:
    This compressed sampling strategy could potentially lead to interesting models for various sensing operations performed biologically. Skarda and Freeman have proposed a non-linear chaotic dynamic to explain the analysis of sensory inputs. This chaotic state of the brain ensures robustness toward unknown events and unreliable measurements, without using too many computing resources. While the theory of compressed sensing is presented here as a random acquisition process, its extension to deterministic or dynamic settings is a fascinating area for future research in signal processing.
I am mentioning Gabriel Peyré's work because he works on bandelets. I had mentioned bandelets before in the context of an announcement made by the company Let It Wave (headed by Stéphane Mallat), where they showed that faces could be compressed down to 500 bytes, or the size of a bar code.



If bandelets provide a recognizable face nonlinearly with 500 bytes, one only needs about 5 × 500 bytes = 2.5 KB of random samples (in the compressed sensing sense) of that face to be able to reconstruct it. 2.5 KB is better than 10 MP or 1 MB. The factor of 5 is a rule-of-thumb bound for compressed sensing (more specific asymptotic laws/results can be found in the summary by Terry Tao).
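For reference, a hedged version of the asymptotic statement behind that rule of thumb (the exact constants vary across the results of Candès, Romberg, Tao and Donoho): a k-sparse signal of length n can be recovered by l1 minimization from on the order of

```latex
m \;\gtrsim\; C \, k \, \log(n/k)
```

random measurements, with a modest constant C; the factor of 5 above plays the role of C log(n/k) for the signal sizes discussed here.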

However, the strongest result so far is the one found by Mike Wakin on random projections onto a manifold, where he uses neighborhood criteria in the compressed sensing space to permit an even better reconstruction than just assuming sparsity. In the figure below, one can compare the results of manifold-based recovery from 5 random projections against the traditional recovery algorithms, Orthogonal Matching Pursuit and Basis Pursuit.
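Since Orthogonal Matching Pursuit is one of the two baseline recovery algorithms mentioned above, here is a minimal, self-contained sketch of it on a synthetic problem (the measurement setup and sizes are made up for illustration, not taken from Wakin's experiments):

```python
import numpy as np

def omp(A, y, k):
    """Orthogonal Matching Pursuit: greedily pick the column of A most
    correlated with the residual, then re-fit y on the selected columns."""
    residual, support = y.copy(), []
    x = np.zeros(A.shape[1])
    for _ in range(k):
        support.append(int(np.argmax(np.abs(A.T @ residual))))
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    x[support] = coef
    return x

# Synthetic check: recover a k-sparse vector from m random measurements.
rng = np.random.default_rng(0)
n, m, k = 200, 60, 6
A = rng.standard_normal((m, n)) / np.sqrt(m)
x_true = np.zeros(n)
x_true[rng.choice(n, k, replace=False)] = rng.standard_normal(k)
y = A @ x_true
x_hat = omp(A, y, k)
print("support recovered:", set(np.flatnonzero(x_hat)) == set(np.flatnonzero(x_true)))
```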



Naturally, an extension of this is target detection, as featured in the Smashed Filter article from Rice. With this type of result/framework, I am betting we can go lower than 2.5 KB of universal samples to characterize a face. The connection between cognition, compressed sensing and autism is simple: face processing seems deficient in people affected by autism, and we don't know why. A model based on the elements I just mentioned might give some insight into this issue.