A machine learning project for predicting product prices using ensemble methods, meta-learning, and computer vision. The best model, a stacked meta-learner, achieves 50.12% SMAPE (lower is better).

Quick start:

    # Clone the repository
    git clone <repository-url>
    cd amazonml-price-prediction

    # Install dependencies
    pip install -r requirements.txt

    # Run the best model (Meta-Learning)
    cd models
    python meta_learning_model.py


Project structure:

    ├── models/                        # Main optimized models (BEST)
    │   ├── meta_learning_model.py     # Best: 50.12% SMAPE
    │   ├── neural_enhanced_model.py   # 50.45% SMAPE
    │   └── computer_vision_model.py   # ~50% SMAPE
    │
    ├── experiments/                   # Research & development
    │   ├── ensemble/                  # Ensemble approaches
    │   ├── optimization/              # Advanced optimizations
    │   └── legacy/                    # Earlier experiments
    │
    ├── results/                       # Model outputs & analysis
    ├── scripts/                       # Utility scripts
    ├── src/                           # Core utilities
    ├── dataset/                       # Training/test data
    └── image_cache/                   # Downloaded images (2,183 files)

| Model | SMAPE | Status | Features |
|---|---|---|---|
| Meta-Learning | 50.12% | Best | Advanced stacking + comprehensive features |
| Neural Enhanced | 50.45% | Second | Deep learning + feature interactions |
| Computer Vision | ~50% | Experimental | Image features + text analysis |
| Ensemble Models | 70-80% | Needs work | Various ensemble attempts |

Target: <48% SMAPE | Best Achievement: 50.12% SMAPE

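
For reference, SMAPE (symmetric mean absolute percentage error) is commonly defined as the mean of `|predicted - actual| / ((|actual| + |predicted|) / 2)`, expressed as a percentage, so lower is better. The sketch below assumes that definition (range 0 to 200%); the scoring code in this repository may differ in details such as zero handling, and the function name `smape` is only illustrative.

```python
def smape(actual, predicted):
    """SMAPE in percent (0-200), assuming the common symmetric definition."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("inputs must not be empty")
    total = 0.0
    for a, p in zip(actual, predicted):
        denom = (abs(a) + abs(p)) / 2.0
        # Convention assumed here: a pair where both values are zero
        # contributes no error instead of dividing by zero.
        if denom > 0:
            total += abs(p - a) / denom
    return 100.0 * total / len(actual)


if __name__ == "__main__":
    # Made-up prices for illustration only (not project data).
    prices = [10.0, 20.0, 40.0]
    preds = [12.0, 15.0, 40.0]
    print(f"SMAPE: {smape(prices, preds):.2f}%")
```

Because the denominator includes the prediction itself, the metric penalizes relative error, which is why cheap and expensive products count equally toward the score.
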
- Meta-Learning: Advanced stacking with 7 base models
- Computer Vision: Real image feature extraction (2,183+ images)
- NLP: Comprehensive text feature engineering
- Ensemble Methods: Multiple ensemble approaches tested
- Robust Validation: Cross-validation with proper SMAPE optimization
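
The stacking idea in the first bullet can be sketched as follows. This is a minimal, dependency-free illustration and not the code in `meta_learning_model.py`: it assumes two toy base models (a global mean and a per-category median) instead of the seven real ones, and its "meta-learner" simply grid-searches one blend weight on out-of-fold predictions. All function names and the `(category, price)` row layout are hypothetical.

```python
from statistics import mean, median


def smape(actual, predicted):
    """SMAPE in percent; zero-denominator pairs contribute no error."""
    terms = [
        abs(p - a) / ((abs(a) + abs(p)) / 2.0) if (abs(a) + abs(p)) > 0 else 0.0
        for a, p in zip(actual, predicted)
    ]
    return 100.0 * sum(terms) / len(terms)


def fit_global_mean(rows):
    """Toy base model 1: always predict the training mean price."""
    avg = mean(price for _, price in rows)
    return lambda category: avg


def fit_category_median(rows):
    """Toy base model 2: predict the median price of the item's category."""
    fallback = median(price for _, price in rows)
    by_cat = {}
    for category, price in rows:
        by_cat.setdefault(category, []).append(price)
    medians = {c: median(v) for c, v in by_cat.items()}
    return lambda category: medians.get(category, fallback)


BASE_FITTERS = [fit_global_mean, fit_category_median]


def out_of_fold_predictions(rows, n_folds=5):
    """Predict each row with base models trained on the other folds only."""
    oof = [[0.0] * len(BASE_FITTERS) for _ in rows]
    for fold in range(n_folds):
        train = [r for i, r in enumerate(rows) if i % n_folds != fold]
        models = [fit(train) for fit in BASE_FITTERS]
        for i, (category, _) in enumerate(rows):
            if i % n_folds == fold:
                oof[i] = [m(category) for m in models]
    return oof


def fit_blend_weight(oof, actual, steps=20):
    """Toy meta-learner: pick the blend weight with the lowest OOF SMAPE."""
    def score(w):
        blended = [w * a + (1.0 - w) * b for a, b in oof]
        return smape(actual, blended)
    return min((s / steps for s in range(steps + 1)), key=score)


def fit_stack(rows, n_folds=5):
    """Train the full stack and return a predict(category) function."""
    actual = [price for _, price in rows]
    weight = fit_blend_weight(out_of_fold_predictions(rows, n_folds), actual)
    # Refit the base models on all rows once the blend weight is fixed.
    model_a, model_b = (fit(rows) for fit in BASE_FITTERS)
    return lambda c: weight * model_a(c) + (1.0 - weight) * model_b(c)


if __name__ == "__main__":
    # Made-up rows for illustration only (not project data).
    data = [("book", 10.0), ("book", 12.0), ("tv", 300.0), ("tv", 320.0),
            ("book", 11.0), ("tv", 310.0), ("book", 9.0), ("tv", 290.0),
            ("book", 13.0), ("tv", 305.0)]
    predict = fit_stack(data)
    print(f"book: {predict('book'):.2f}, tv: {predict('tv'):.2f}")
```

The key point is that the meta-learner only ever sees out-of-fold predictions, so it is tuned on outputs the base models produced for rows they were not trained on.
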

Requirements:

- Python 3.8+
- pandas, numpy, scikit-learn
- lightgbm, xgboost
- Pillow (PIL) for image processing
- See `requirements.txt` for the full list

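
Run the best model (meta-learning):

    cd models
    python meta_learning_model.py

Run the computer vision model:

    cd models
    python computer_vision_model.py

Download the images used by the computer vision model:

    cd scripts
    python download_computer_vision_images.py

- Training Data: 75,000 samples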
- Test Data: 75,000 samples
- Image Dataset: 140,587 unique URLs (2,183+ downloaded)
- Feature Engineering: 600+ features per model
- Cross-Validation: 5-fold stratified
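
Stratification is defined for class labels, so applying it to a continuous price target needs some form of binning. The source does not say how the folds were stratified; one common approach, assumed here, is to sort samples by price and spread each run of neighbouring prices across the folds so that every fold covers the full price range. The function name and seed are illustrative.

```python
import random


def stratified_regression_folds(prices, n_folds=5, seed=42):
    """Assign each sample a fold id so every fold spans the price range.

    Samples are sorted by price and split into consecutive groups of
    n_folds; within each group the fold ids are shuffled, so neighbouring
    prices land in different folds.
    """
    rng = random.Random(seed)
    order = sorted(range(len(prices)), key=lambda i: prices[i])
    folds = [0] * len(prices)
    for start in range(0, len(order), n_folds):
        group = order[start:start + n_folds]
        fold_ids = list(range(n_folds))
        rng.shuffle(fold_ids)
        for idx, fold in zip(group, fold_ids):
            folds[idx] = fold
    return folds


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic, skewed prices for illustration only (not project data).
    prices = [round(rng.lognormvariate(3.0, 1.0), 2) for _ in range(1000)]
    folds = stratified_regression_folds(prices)
    for k in range(5):
        fold_prices = [p for p, f in zip(prices, folds) if f == k]
        print(k, len(fold_prices), round(sum(fold_prices) / len(fold_prices), 2))
```

With skewed price data, plain random folds can leave one fold with most of the expensive items; spreading sorted neighbours across folds keeps the per-fold SMAPE estimates more comparable.
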
See the `results/` folder for detailed performance analysis.

The `experiments/` folder contains extensive research:
- Ensemble Methods: Gradient boosting combinations
- Optimization: Advanced hyperparameter tuning
- Legacy Models: Early development iterations

To contribute:

- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request

This project is licensed under the MIT License - see the LICENSE file for details.
- Advanced machine learning techniques
- Meta-learning and stacking approaches
- Computer vision for e-commerce
- Ensemble method research

Star this repo if it helped you!