Transforming data into actionable insights through advanced analytics and machine learning
Data Scientist passionate about solving complex business problems through data-driven approaches. My work spans the entire data science lifecycle: from exploratory analysis and feature engineering to model development, deployment, and monitoring in production environments.
class DataScientist:
    def __init__(self):
        self.name = "Iñaki Rosello"
        self.role = "Data Scientist & CS Student"
        self.location = "Buenos Aires, Argentina"

    def expertise(self):
        return {
            "analytics": ["EDA", "Statistical Analysis", "Data Visualization"],
            "modeling": ["Classification", "Clustering", "Recommendation Systems"],
            "mlops": ["Model Deployment", "Monitoring", "CI/CD Pipelines"],
            "business": ["Fraud Detection", "Churn Prediction", "Demand Forecasting"],
        }

    def currently_learning(self):
        return ["Deep Learning", "Advanced MLOps", "Price Optimization"]
Customer Churn Prediction with Full MLOps Pipeline
Production ML system for FinTech churn prediction optimized for Recall (0.76). Complete pipeline: SMOTE balancing, model comparison (RF, XGBoost, LogReg), hyperparameter tuning, SHAP explainability, MLflow tracking, Evidently drift detection, FastAPI + Docker deployment.
Stack: Scikit-learn, MLflow, Evidently, SHAP, FastAPI, Docker
Skills: Imbalanced Data, Model Selection, Explainable AI, Production MLOps
Retail Demand Forecasting & Price Optimization
Complete data science pipeline for demand estimation and optimal pricing strategy. Integrates econometric analysis (price elasticity) with predictive modeling.
Stack: Python, Pandas, Scikit-learn, Optimization
Skills: Feature Engineering, Time Series, Business Analytics
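A minimal sketch of how a price elasticity estimate can feed a pricing decision. It assumes a constant-elasticity demand curve (quantity = A · price^e) fitted by a log-log regression, and a hypothetical `unit_cost`; the project's actual demand model, data, and optimizer are not described here, so every name and number below is illustrative.

```python
import math


def estimate_elasticity(prices, quantities):
    """OLS slope of ln(quantity) on ln(price), i.e. the price elasticity."""
    log_p = [math.log(p) for p in prices]
    log_q = [math.log(q) for q in quantities]
    mean_p = sum(log_p) / len(log_p)
    mean_q = sum(log_q) / len(log_q)
    cov = sum((x - mean_p) * (y - mean_q) for x, y in zip(log_p, log_q))
    var = sum((x - mean_p) ** 2 for x in log_p)
    return cov / var


def profit_maximizing_price(unit_cost, elasticity):
    """Price that maximizes (price - unit_cost) * A * price**elasticity."""
    if elasticity >= -1:
        # With inelastic demand, profit keeps growing as price rises
        raise ValueError("an interior optimum requires elasticity < -1")
    return unit_cost * elasticity / (1 + elasticity)


# Synthetic observations generated with a true elasticity of -2 (not real data)
prices = [8.0, 10.0, 12.5, 16.0]
quantities = [1000 * p ** -2 for p in prices]

elasticity = estimate_elasticity(prices, quantities)
best_price = profit_maximizing_price(unit_cost=5.0, elasticity=elasticity)
```

The closed form comes from setting the derivative of profit to zero, which gives price = cost · e / (1 + e). With an elasticity of -2 the suggested price is twice the unit cost.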
E-commerce Fraud Detection System
Final project for EDVAI Bootcamp. Production-ready fraud detection using clustering + classification. Includes API, Docker containerization, and Gradio UI.
Stack: Scikit-learn, FastAPI, Docker, Gradio
Skills: Unsupervised Learning, Model Deployment, API Design
Hybrid Recommendation Engine
Personal project exploring recommendation systems. Implements content-based + collaborative filtering with efficient similarity search using FAISS & ANNOY.
Stack: FAISS, ANNOY, FastAPI, Gradio
Skills: RecSys, Vector Search, Similarity Algorithms, API Development
📊 Data Science & Analytics
Core Libraries
Pandas
NumPy
SciPy
Visualization
Matplotlib
Plotly
Seaborn
🚀 MLOps & Production
Experiment Tracking & Monitoring
MLflow
Evidently
Deployment & APIs
FastAPI
Gradio
Docker
Cloud & Infrastructure
Azure
- 🎓 Computer Science - Universidad de Buenos Aires (Currently Studying)
- 📜 Data Science & MLOps Bootcamp - EDVAI (Completed)
- 🏆 AlixPartners Case Competition 2025 - Participant
- 💼 NoCountry Simulation - Data Science Team Member
- 📚 Continuous Learning: Deep Learning, Advanced Analytics (EDA, Data Preparation), Production ML
Current Learning Path:
- 🧠 Deep Learning fundamentals and advanced architectures
- 📈 Time series forecasting and demand prediction
- 🔧 Production-grade MLOps practices and automation
- 💰 Price optimization and econometric modeling
- 🐳 Scalable deployment with Docker and cloud services
Core Competencies:
- Analytics: Exploratory Analysis, Statistical Modeling, Business Intelligence
- Machine Learning: Classification, Clustering, Recommendation Systems, Fraud Detection
- MLOps: Model Deployment, Monitoring & Drift Detection, CI/CD Pipelines, Containerization
- Business Applications: Churn Prediction, Demand Forecasting, Price Optimization
📊 FinTech Churn Prediction - Technical Overview
Business Context: Customer retention prediction for a FinTech platform
Challenge: Imbalanced dataset (80% no-churn, 20% churn)
Optimization Goal: Maximize Recall to minimize false negatives (missed churners)
Data Pipeline:
- Preprocessing: Feature engineering on cleaned_data.csv
- Scaling: StandardScaler for linear models
- Balancing: SMOTE oversampling on training set
- Split: 80/20 train-test with stratification
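The ordering of these steps matters: split first, fit the scaler on the training rows only, and oversample only the training set so the test set keeps its real class ratio. The sketch below illustrates that ordering with the standard library alone. It is a simplified stand-in, not the project's code: `smote_like_oversample` mimics the SMOTE idea (interpolating between a minority row and one of its nearest minority neighbours) rather than calling imbalanced-learn, `k=3` is an arbitrary choice, and the data is synthetic.

```python
import random
import statistics
from collections import defaultdict


def stratified_split(y, test_size=0.2, seed=42):
    """Return (train_idx, test_idx) with the class ratio preserved in both."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(y):
        by_class[label].append(idx)
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_size)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return train_idx, test_idx


def fit_scaler(rows):
    """Column means and population standard deviations from training rows."""
    columns = list(zip(*rows))
    means = [statistics.fmean(col) for col in columns]
    stds = [statistics.pstdev(col) or 1.0 for col in columns]
    return means, stds


def scale(rows, means, stds):
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def smote_like_oversample(X, y, minority=1, k=3, seed=42):
    """Add synthetic minority rows until both classes have the same count."""
    rng = random.Random(seed)
    minority_rows = [row for row, label in zip(X, y) if label == minority]
    n_needed = (len(y) - len(minority_rows)) - len(minority_rows)
    X_out, y_out = list(X), list(y)
    for _ in range(max(n_needed, 0)):
        base = rng.choice(minority_rows)
        # k nearest minority neighbours of `base`, excluding itself
        neighbours = sorted(
            (row for row in minority_rows if row is not base),
            key=lambda row: sum((a - b) ** 2 for a, b in zip(base, row)),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position on the segment base -> neighbour
        X_out.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
        y_out.append(minority)
    return X_out, y_out


# Synthetic 80/20 data standing in for the real features (not project data)
data_rng = random.Random(0)
X = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(80)]
X += [[data_rng.gauss(2, 1), data_rng.gauss(2, 1)] for _ in range(20)]
y = [0] * 80 + [1] * 20

train_idx, test_idx = stratified_split(y)
means, stds = fit_scaler([X[i] for i in train_idx])
X_train = scale([X[i] for i in train_idx], means, stds)
X_test = scale([X[i] for i in test_idx], means, stds)
y_train = [y[i] for i in train_idx]
y_test = [y[i] for i in test_idx]

# Oversample the training set only; the test set is left untouched
X_bal, y_bal = smote_like_oversample(X_train, y_train)
```

Fitting the scaler and the oversampler on the training rows alone keeps information from the test set out of training.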
Model Development:
| Model | Hyperparameter Tuning | Recall | F1-Score | ROC AUC |
|---|---|---|---|---|
| Random Forest | RandomizedSearchCV → GridSearchCV | 0.66 | 0.57 | 0.83 |
| XGBoost | RandomizedSearchCV → GridSearchCV | 0.55 | 0.59 | 0.84 |
| Logistic Regression ✅ | RandomizedSearchCV → GridSearchCV | 0.76 | 0.56 | 0.84 |
Winner: Logistic Regression with class_weight='balanced'
Reason: Highest Recall (0.76), meeting the business requirement of catching as many churners as possible
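To make the Recall-first choice concrete, the sketch below computes Recall, Precision, and F1 from raw predictions on a toy label set (made up for illustration). It also backs out the precision implied by a Recall and F1 pair, assuming both reported figures refer to the churn class at the same decision threshold.

```python
def recall_precision_f1(y_true, y_pred):
    """Binary metrics where label 1 is the positive (churn) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return recall, precision, f1


def implied_precision(recall, f1):
    """Solve F1 = 2PR / (P + R) for P."""
    return f1 * recall / (2 * recall - f1)


# Toy labels: 4 churners and 6 non-churners
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]
recall, precision, f1 = recall_precision_f1(y_true, y_pred)

# Rounded figures from the Logistic Regression row of the table
logreg_precision = implied_precision(recall=0.76, f1=0.56)
```

With the rounded table values, the implied precision is roughly 0.44: a little under half of the customers flagged by the model would actually churn, which is the cost of missing fewer churners.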
MLOps Implementation:
- Experiment Tracking: MLflow for all runs, parameters, and metrics
- Model Explainability: SHAP for feature importance analysis
- Model Artifacts: Serialized model + scaler with pickle
- Monitoring: Evidently AI for drift detection
- Deployment: FastAPI + Docker containerization
- CI/CD: Automated retraining pipeline
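As an illustration of what a drift check measures, here is a dependency-free sketch of the Population Stability Index (PSI) for a single numeric feature. PSI is one common drift statistic; which tests Evidently runs in this project is not stated, so treat the function, the bin count, and the synthetic samples as assumptions.

```python
import math
import random


def population_stability_index(reference, current, n_bins=10):
    """PSI between a reference sample and a current sample of one feature."""
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / n_bins or 1.0

    def bin_shares(values):
        counts = [0] * n_bins
        for v in values:
            # Clamp so out-of-range values fall into the edge bins
            idx = min(max(int((v - lo) / width), 0), n_bins - 1)
            counts[idx] += 1
        # A small floor avoids log(0) when a bin is empty
        return [max(c / len(values), 1e-6) for c in counts]

    ref = bin_shares(reference)
    cur = bin_shares(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))


# Synthetic samples: training-time data, fresh data, and shifted data
rng = random.Random(7)
reference = [rng.gauss(0, 1) for _ in range(1000)]
same = [rng.gauss(0, 1) for _ in range(1000)]
shifted = [rng.gauss(1.5, 1) for _ in range(1000)]

psi_same = population_stability_index(reference, same)
psi_shifted = population_stability_index(reference, shifted)
```

A frequently quoted rule of thumb treats PSI above about 0.25 as a significant shift; a retraining pipeline could use a threshold like that as its trigger.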
Key Learnings:
- SMOTE effectively handled class imbalance
- Logistic Regression beat both tree-based models on Recall (0.76 vs. 0.66 and 0.55) at a comparable ROC AUC (0.83 to 0.84)
- Recall optimization crucial for business impact
- SHAP provided actionable insights for stakeholders
🔐 Fraud Detection System - Architecture & Approach
Project Type: EDVAI Bootcamp Final Project
Domain: E-commerce Transaction Fraud Detection
Methodology:
- Unsupervised Learning: Clustering to identify fraud patterns
- Supervised Learning: Classification on clustered features
- Ensemble Approach: Combined insights from both techniques
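One way to read "classification on clustered features" is sketched below: cluster the transactions, then append each row's cluster id and distance to its centroid as extra inputs for a downstream classifier. The clustering algorithm and the exact features are not specified above, so this assumes plain k-means and a hypothetical `[amount, n_items]` feature layout with made-up values.

```python
import math


def nearest(row, centroids):
    """Index of the closest centroid and the Euclidean distance to it."""
    dists = [math.dist(row, c) for c in centroids]
    idx = min(range(len(dists)), key=dists.__getitem__)
    return idx, dists[idx]


def kmeans(rows, k, n_iter=20):
    """Lloyd's algorithm; the first k rows seed the centroids (deterministic)."""
    centroids = [list(r) for r in rows[:k]]
    for _ in range(n_iter):
        groups = [[] for _ in range(k)]
        for row in rows:
            groups[nearest(row, centroids)[0]].append(row)
        centroids = [
            [sum(col) / len(g) for col in zip(*g)] if g else centroids[i]
            for i, g in enumerate(groups)
        ]
    return centroids


def add_cluster_features(rows, centroids):
    """Append cluster id and distance-to-centroid to every row."""
    augmented = []
    for row in rows:
        idx, dist = nearest(row, centroids)
        augmented.append(list(row) + [idx, dist])
    return augmented


# Toy transactions as [amount, n_items]; unscaled for brevity
transactions = [
    [10.0, 1], [500.0, 3], [12.0, 1],
    [11.0, 2], [520.0, 5], [11.5, 1],
]
centroids = kmeans(transactions, k=2)
train_features = add_cluster_features(transactions, centroids)

# A new transaction that sits far from both learned clusters
new_row = add_cluster_features([[300.0, 2]], centroids)[0]
```

The augmented rows (original features plus cluster id and distance) would then be passed to any supervised classifier. A large distance to the nearest centroid, as for `new_row`, is the kind of signal the unsupervised stage can contribute.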
Production Features:
- RESTful API with FastAPI
- Docker containerization for deployment
- Interactive Gradio UI for demos
- Real-time fraud scoring
Technologies: Scikit-learn, FastAPI, Docker, Gradio
🎬 Movie Recommendation System - Implementation Details
Approach: Hybrid Recommendation System
Components:
- Content-Based Filtering: TF-IDF on movie metadata
- Collaborative Filtering: User-item interaction matrix
- Similarity Search: FAISS & ANNOY for efficient retrieval
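The content-based part can be sketched without any dependencies: build TF-IDF vectors from metadata text, then rank items by cosine similarity. The catalogue below is made up, and the linear scan in `most_similar` is exactly the step that an approximate index such as FAISS or ANNOY would replace at scale. The collaborative-filtering component is not covered here.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build one sparse {term: weight} vector per document."""
    tokenized = [doc.lower().split() for doc in docs]
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    n_docs = len(docs)
    vectors = []
    for tokens in tokenized:
        term_counts = Counter(tokens)
        vectors.append({
            # Term frequency times a smoothed inverse document frequency
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in term_counts.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(weight * b[term] for term, weight in a.items() if term in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def most_similar(title, titles, vectors, top_k=1):
    """Brute-force nearest neighbours over the whole catalogue."""
    query = vectors[titles.index(title)]
    scored = [
        (cosine(query, vec), other)
        for other, vec in zip(titles, vectors)
        if other != title
    ]
    return [other for _, other in sorted(scored, reverse=True)[:top_k]]


# Hypothetical catalogue: title -> metadata keywords
catalogue = {
    "Star Drift": "sci-fi space crew survival",
    "Orbit Down": "sci-fi space action marines",
    "Paper Hearts": "romance comedy london bookshop",
    "Winter Letters": "romance comedy london christmas",
}
titles = list(catalogue)
vectors = tfidf_vectors(list(catalogue.values()))

space_pick = most_similar("Star Drift", titles, vectors)
romance_pick = most_similar("Paper Hearts", titles, vectors)
```

The scan costs one similarity computation per catalogue item for every query, which is why an approximate nearest-neighbour index becomes worthwhile as the catalogue grows.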
Performance:
- Sub-second response time for recommendations
- Scalable to millions of items
- API-ready deployment
Technologies: FAISS, ANNOY, FastAPI, Gradio
Data Science Projects:
- ✅ Churn Prediction (FinTech) - Achieved 0.76 Recall through SMOTE + Logistic Regression optimization
- ✅ Fraud Detection System - Built production-ready fraud classifier with clustering + classification approach
- ✅ Recommendation Engine - Implemented hybrid RecSys with FAISS/ANNOY for sub-second retrieval
- ✅ Demand Forecasting - Created econometric optimization model for retail pricing strategy
Technical Skills:
- 📊 End-to-end data science workflows with imbalanced data handling
- 🤖 Model selection and hyperparameter optimization (RandomizedSearchCV, GridSearchCV)
- 🔍 Advanced feature engineering and SMOTE oversampling
- 📈 Model interpretability with SHAP and explainable AI
- 🚀 Production deployment with MLflow tracking and Evidently monitoring
- 📉 Drift detection and automated model retraining pipelines
Bootcamp & Competitions:
- 🎯 EDVAI Data Science & MLOps Bootcamp
- 🏆 AlixPartners 2025 Case Competition - Demand Forecasting Hackathon
- 💼 NoCountry Job Simulation - FinTech Churn Prediction Project
# My typical approach to DS projects
# (helper functions are placeholders for project-specific steps)
def data_science_project(problem, sources):
    """End-to-end data science methodology."""
    # 1. Problem Understanding
    business_context = understand_business_problem(problem)
    success_metrics = define_kpis(business_context)

    # 2. Data Collection & Exploration
    data = collect_data(sources)
    insights = exploratory_analysis(data)

    # 3. Data Preparation
    clean_data = handle_missing_values(data)
    features = feature_engineering(clean_data)
    X_train, X_test, y_train, y_test = split_and_scale(features)

    # 4. Modeling
    models = train_multiple_models(X_train, y_train)
    best_model = optimize_hyperparameters(models)
    results = evaluate_model(best_model, X_test, y_test, success_metrics)

    # 5. Deployment & Monitoring
    api = deploy_model(best_model)
    monitor_performance(api)
    return api
I'm always open to collaborating on data science projects, discussing new techniques, or connecting with fellow data enthusiasts and professionals!
📧 Open to opportunities in Data Science, ML Engineering, and Analytics roles
Thanks for visiting! ⭐ If you find my projects interesting, consider giving them a star!