In my previous post, I provided step-by-step instructions on how to install NVIDIA DIGITS 3 on Amazon EC2. In this post, we are going to use an Amazon Machine Image (AMI) that I have configured for readers of this article. This AMI comes preloaded with DIGITS 3 and the 17 Flowers dataset from Oxford's Visual Geometry Group. We will use this AMI to quickly launch an instance on Amazon EC2 and try a couple of Deep Learning experiments.
In the video below we show how to launch an instance on Amazon EC2 using the AMI I have shared. We explore basic usage of DIGITS 3 starting with data preparation, database exploration, training a neural network, improving performance, and testing the learned neural network on a new image.
The AMI I have shared (ID: ami-5bac4e3b, region: US West (Oregon)) has NVIDIA DIGITS 3 preinstalled. I have also included the 17 Flowers dataset from Oxford’s Visual Geometry Group in the AMI at /home/ubuntu/data/17flowers. In addition, AlexNet weights are included for pretraining at /home/ubuntu/models.
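If you prefer to launch the instance from a script rather than the AWS console, the sketch below shows roughly how that could look with boto3. The AMI ID and region come from this post; everything else is an assumption for illustration: the `g2.2xlarge` instance type, the `my-key` key pair name, and the helper names `build_launch_params` and `launch` are hypothetical and should be adapted to your own account.

```python
# Values taken from the article.
AMI_ID = "ami-5bac4e3b"
REGION = "us-west-2"  # US West (Oregon)


def build_launch_params(key_name, instance_type="g2.2xlarge"):
    """Build keyword arguments for EC2 run_instances.

    The default instance type is an assumption (a GPU instance);
    the article does not say which type to use.
    """
    return {
        "ImageId": AMI_ID,
        "InstanceType": instance_type,
        "KeyName": key_name,  # name of an existing EC2 key pair
        "MinCount": 1,
        "MaxCount": 1,
    }


def launch(key_name, instance_type="g2.2xlarge"):
    """Launch one instance from the shared AMI and return its instance ID.

    Requires boto3 and configured AWS credentials.
    """
    import boto3  # imported here so build_launch_params works without boto3

    ec2 = boto3.client("ec2", region_name=REGION)
    response = ec2.run_instances(**build_launch_params(key_name, instance_type))
    return response["Instances"][0]["InstanceId"]
```

Calling `launch("my-key")` would start a single instance in US West (Oregon). You would still need a security group that allows you to reach the instance before you can open DIGITS in a browser.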
To demo DIGITS 3, we trained AlexNet with default training parameters on the 17 Flowers dataset. After about 4 minutes of training, AlexNet reached an accuracy of 67%.
For a quick demo, 67% is not bad, but can we do better? Of course!
By simply using pre-trained AlexNet weights and making some minor modifications, we see a huge improvement in accuracy (> 90%).