Fraga9/VibeMatch

VibeMatch Banner

AI-Powered Music Compatibility Matching

MIT License Python Next.js FastAPI PyTorch

Find your musical soulmate using Graph Neural Networks and vector similarity search

Demo · Docs · Report Bug · Request Feature


What is VibeMatch?

VibeMatch connects users with similar music taste by analyzing their listening patterns through deep learning. Unlike traditional systems that compare artist lists, VibeMatch captures complex musical relationships through a Graph Neural Network trained on millions of music interactions.

Key Innovation: We don't just match artists you both like—we understand the musical relationships between artists to find truly compatible taste profiles.

Key Features

  • Deep Music Understanding: Graph Neural Network analyzes 338K+ tracks and 6.9K+ artists to learn musical relationships
  • Instant Matching: Sub-10ms vector search across user profiles using Qdrant
  • Multi-temporal Analysis: Combines long-term preferences with recent listening trends
  • Cold Start Solution: Synthetic profiles ensure immediate matches for new users
  • Privacy-First: Only uses public Last.fm data, GDPR compliant

Screenshots

How It Works

graph LR
 A[Last.fm Profile] --> B[Multi-Period Fetch]
 B --> C[Weighted Embedding]
 C --> D[Qdrant Search]
 D --> E[Ranked Matches]
 style A fill:#e0f2fe,stroke:#0ea5e9,stroke-width:2px,color:#0c4a6e
 style B fill:#ffe4e6,stroke:#f43f5e,stroke-width:2px,color:#881337
 style C fill:#dcfce7,stroke:#22c55e,stroke-width:2px,color:#14532d
 style D fill:#f3e8ff,stroke:#a855f7,stroke-width:2px,color:#581c87
 style E fill:#fef3c7,stroke:#eab308,stroke-width:2px,color:#713f12

Step-by-step:

  1. Authentication: Connect your Last.fm account via OAuth
  2. Data Fetching: Retrieve listening history across multiple time periods (all-time, 6mo, 3mo, recent)
  3. Embedding Generation: Your music taste is encoded into a 128D vector using the trained GNN (~500ms)
  4. Vector Search: Qdrant finds top-K compatible users via cosine similarity (<10ms)
  5. Results: View matches with compatibility scores and shared artists
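The vector search in step 4 can be illustrated without a vector database. The following is a minimal sketch, assuming brute-force scoring over a small in-memory dictionary; it is a stand-in for the Qdrant query, not the project's actual code, and the names `l2_normalize`, `top_k_matches`, and `profiles` are hypothetical. It relies on the fact that the dot product of two unit-length vectors equals their cosine similarity.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length; zero vectors are returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def top_k_matches(query, profiles, k=3):
    """Return the k (user_id, score) pairs with the highest cosine similarity."""
    q = l2_normalize(query)
    scored = []
    for user_id, vec in profiles.items():
        v = l2_normalize(vec)
        # Dot product of unit vectors == cosine similarity.
        scored.append((user_id, sum(a * b for a, b in zip(q, v))))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy 3-D vectors standing in for the 128-D user embeddings.
profiles = {
    "alice": [1.0, 0.0, 0.0],
    "bob": [0.9, 0.1, 0.0],
    "carol": [0.0, 1.0, 0.0],
}
matches = top_k_matches([1.0, 0.0, 0.0], profiles, k=2)
print(matches)
```

A brute-force scan is linear in the number of profiles; an approximate index such as HNSW trades a small amount of recall for much faster lookups at scale.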

System Architecture

graph LR
 subgraph Frontend["Frontend Layer"]
 direction TB
 A[Next.js 15<br/>TypeScript]
 A1[Tailwind UI]
 A2[Zustand State]
 A --> A1
 A --> A2
 end
 subgraph Backend["API Layer"]
 direction TB
 B[FastAPI<br/>Async Server]
 C[Embedding<br/>Service]
 D[LRU Cache<br/>8K entries]
 
 B --> C
 C --> D
 end
 subgraph ML["ML Pipeline"]
 direction TB
 E[PyTorch +<br/>PyG]
 F[LightGCN<br/>3 Layers]
 G[Embeddings<br/>338K tracks<br/>6.9K artists]
 
 E --> F
 F --> G
 end
 subgraph VectorDB["Vector Search"]
 direction TB
 H[(Qdrant<br/>Database)]
 I[HNSW Index<br/>Cosine Sim]
 
 H --> I
 end
 subgraph External["External APIs"]
 J[Last.fm<br/>OAuth + Data]
 end
 Frontend -->|REST API| Backend
 Backend -->|Vector Query| VectorDB
 Backend -->|Auth + Fetch| External
 Backend -.->|Embedding Lookup| ML
 VectorDB -.->|Store Vectors| Backend
 style Frontend fill:#e0f2fe,stroke:#0ea5e9,stroke-width:2px,color:#0c4a6e
 style Backend fill:#dcfce7,stroke:#22c55e,stroke-width:2px,color:#14532d
 style ML fill:#ffe4e6,stroke:#f43f5e,stroke-width:2px,color:#881337
 style VectorDB fill:#f3e8ff,stroke:#a855f7,stroke-width:2px,color:#581c87
 style External fill:#fef3c7,stroke:#eab308,stroke-width:2px,color:#713f12
 
 style A fill:#bae6fd,stroke:#0284c7,stroke-width:2px,color:#075985
 style B fill:#bbf7d0,stroke:#16a34a,stroke-width:2px,color:#166534
 style E fill:#fecdd3,stroke:#e11d48,stroke-width:2px,color:#9f1239
 style H fill:#e9d5ff,stroke:#9333ea,stroke-width:2px,color:#6b21a8
 style J fill:#fef08a,stroke:#ca8a04,stroke-width:2px,color:#854d0e

Machine Learning

LightGCN (Light Graph Convolutional Network)

graph LR
 subgraph GraphStructure["Graph Structure"]
 direction TB
 T1[Track 1] 
 T2[Track 2]
 T3[Track 3]
 A1[Artist 1]
 A2[Artist 2]
 
 T1 -.->|authored by| A1
 T2 -.->|authored by| A1
 T3 -.->|authored by| A2
 T1 ---|co-occurrence| T2
 A1 ---|similar| A2
 end
 subgraph Training["GNN Training Pipeline"]
 direction TB
 G[345K Nodes<br/>2.7M Edges] --> L[3-Layer LightGCN]
 L --> E[128D Embeddings]
 E --> N[L2 Normalization]
 end
 subgraph Optimization["Training & Optimization"]
 direction TB
 N --> BPR[BPR Loss]
 BPR --> Adam[Adam Optimizer]
 Adam --> M[Model Weights]
 end
 GraphStructure --> Training
 Training --> Optimization
 style GraphStructure fill:#e0f2fe,stroke:#0ea5e9,stroke-width:2px,color:#0c4a6e
 style Training fill:#ffe4e6,stroke:#f43f5e,stroke-width:2px,color:#881337
 style Optimization fill:#dcfce7,stroke:#22c55e,stroke-width:2px,color:#14532d
 
 style G fill:#bae6fd,stroke:#0284c7,stroke-width:2px,color:#075985
 style L fill:#fecdd3,stroke:#e11d48,stroke-width:2px,color:#9f1239
 style E fill:#fecdd3,stroke:#e11d48,stroke-width:2px,color:#9f1239
 style N fill:#fecdd3,stroke:#e11d48,stroke-width:2px,color:#9f1239
 style BPR fill:#bbf7d0,stroke:#16a34a,stroke-width:2px,color:#166534
 style Adam fill:#bbf7d0,stroke:#16a34a,stroke-width:2px,color:#166534
 style M fill:#bbf7d0,stroke:#16a34a,stroke-width:2px,color:#166534

Model Specs:

  • Embeddings: 128 dimensions, L2 normalized
  • Architecture: 3-layer graph convolution
  • Training: Bayesian Personalized Ranking (BPR) loss
  • Performance: Recall@10: 0.64, Precision: 1.00
  • Graph: 345K nodes (338K tracks, 6.9K artists), 2.7M edges
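To make the BPR training objective concrete, here is a minimal sketch of the loss for a single (anchor, positive, negative) triple, following the definition in Rendle et al.: the loss is small when the anchor scores the positive item above the negative one. How VibeMatch samples triples from its track and artist graph is not stated here, so the triple layout and the helper names `dot` and `bpr_loss` are assumptions for illustration.

```python
import math


def dot(u, v):
    """Inner product used as the preference score."""
    return sum(a * b for a, b in zip(u, v))


def bpr_loss(anchor, positive, negative):
    """BPR loss for one triple: -log(sigmoid(score_pos - score_neg))."""
    diff = dot(anchor, positive) - dot(anchor, negative)
    # -log(sigmoid(diff)) == softplus(-diff), computed in an overflow-safe form.
    return max(-diff, 0.0) + math.log1p(math.exp(-abs(diff)))


anchor = [1.0, 0.0]
liked = [1.0, 0.0]      # should rank high for the anchor
not_liked = [0.0, 1.0]  # should rank low for the anchor

good = bpr_loss(anchor, liked, not_liked)  # correct ordering, lower loss
bad = bpr_loss(anchor, not_liked, liked)   # wrong ordering, higher loss
print(good, bad)
```

In training, this per-triple loss would be averaged over a batch and minimized with an optimizer such as Adam.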

Tech Stack

Frontend

  • Next.js 15 (App Router, RSC)
  • TypeScript (strict mode)
  • Tailwind CSS
  • Zustand (state management)
  • Deployed on Vercel

Backend

  • FastAPI (async REST API)
  • PyTorch + PyTorch Geometric
  • Qdrant (vector database)
  • LRU cache (8K entries)
  • DigitalOcean App Platform

Performance

| Metric | Value |
| --- | --- |
| Embedding Generation | ~500ms |
| Vector Search | <10ms |
| End-to-End Latency | <800ms |
| Model Size | ~168MB |
| User Embedding Coverage | ~95% (exact + fuzzy + zero-shot) |

Dataset

Built from Last.fm data:

  • Source: Last.fm augmented dataset with artist similarity graph
  • Tracks: 338,046 Last.fm tracks
  • Artists: 6,899 unique artists
  • Relationships: 2.7M edges (track-artist, track-track, artist-artist)
  • Genre coverage: 95.9% of tracks have genre assignments

Coverage breakdown for user embeddings:

  • Exact matches: ~60%
  • Fuzzy matches: ~25%
  • Zero-shot inference: ~10%
  • Missing: <5%

User Embedding Strategy

Multi-temporal weighted average with consistency boosting:

graph LR
 subgraph Sources["Data Sources"]
 direction TB
 A1[Overall<br/>45%]
 A2[6 Months<br/>25%]
 A3[3 Months<br/>15%]
 A4[Recent 200<br/>15%]
 end
 subgraph Lookup["Lookup Strategy"]
 direction TB
 B1[Exact Match<br/>O1 - 60%]
 B2[Fuzzy FAISS<br/>~25%]
 B3[Zero-Shot<br/>~10%]
 B4[LRU Cache<br/>8K entries]
 end
 subgraph Weighting["Weighting Pipeline"]
 direction TB
 C1[Consistency Boost<br/>Multi-period × 1.4]
 C2[Temporal Decay<br/>0.5^days/30]
 C3[Playcount Weight<br/>log1p]
 end
 subgraph Output["Output"]
 direction TB
 D[128D Normalized<br/>User Vector]
 end
 Sources --> Lookup
 Lookup --> Weighting
 Weighting --> Output
 B4 -.->|Cache Hit| B1
 style Sources fill:#e0f2fe,stroke:#0ea5e9,stroke-width:2px,color:#0c4a6e
 style Lookup fill:#ffe4e6,stroke:#f43f5e,stroke-width:2px,color:#881337
 style Weighting fill:#dcfce7,stroke:#22c55e,stroke-width:2px,color:#14532d
 style Output fill:#f3e8ff,stroke:#a855f7,stroke-width:2px,color:#581c87
 
 style A1 fill:#bae6fd,stroke:#0284c7,stroke-width:2px,color:#075985
 style A2 fill:#bae6fd,stroke:#0284c7,stroke-width:2px,color:#075985
 style A3 fill:#bae6fd,stroke:#0284c7,stroke-width:2px,color:#075985
 style A4 fill:#bae6fd,stroke:#0284c7,stroke-width:2px,color:#075985
 style B1 fill:#fecdd3,stroke:#e11d48,stroke-width:2px,color:#9f1239
 style B2 fill:#fecdd3,stroke:#e11d48,stroke-width:2px,color:#9f1239
 style B3 fill:#fecdd3,stroke:#e11d48,stroke-width:2px,color:#9f1239
 style B4 fill:#fecdd3,stroke:#e11d48,stroke-width:2px,color:#9f1239
 style C1 fill:#bbf7d0,stroke:#16a34a,stroke-width:2px,color:#166534
 style C2 fill:#bbf7d0,stroke:#16a34a,stroke-width:2px,color:#166534
 style C3 fill:#bbf7d0,stroke:#16a34a,stroke-width:2px,color:#166534
 style D fill:#e9d5ff,stroke:#9333ea,stroke-width:2px,color:#6b21a8
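
The diagram names the weighting factors but not how they are combined, so the following is only a sketch under stated assumptions: each item's weight is the sum over periods of the period weight times `log1p(playcount)`, multiplied by 1.4 when the item appears in more than one period, and decayed with a 30-day half-life (reading `0.5^days/30` as `0.5 ** (days / 30)`). The period keys, `item_weight`, and `user_vector` are hypothetical names.

```python
import math

# Period weights from the diagram; the dictionary keys are illustrative.
PERIOD_WEIGHTS = {"overall": 0.45, "6month": 0.25, "3month": 0.15, "recent": 0.15}
CONSISTENCY_BOOST = 1.4  # applied when an item shows up in several periods
HALF_LIFE_DAYS = 30.0    # assumed reading of the temporal decay term


def item_weight(playcounts, days_since_played=0.0):
    """Weight for one item given {period: playcount} and its age in days."""
    base = sum(PERIOD_WEIGHTS[p] * math.log1p(c) for p, c in playcounts.items())
    if len(playcounts) > 1:
        base *= CONSISTENCY_BOOST
    return base * 0.5 ** (days_since_played / HALF_LIFE_DAYS)


def user_vector(items, embeddings):
    """Weighted average of item embeddings, L2-normalized.

    items: {name: (playcounts, days_since_played)}
    embeddings: {name: list of floats}; items without an embedding are skipped.
    """
    dim = len(next(iter(embeddings.values())))
    acc = [0.0] * dim
    for name, (playcounts, days) in items.items():
        emb = embeddings.get(name)
        if emb is None:
            continue
        w = item_weight(playcounts, days)
        for i, x in enumerate(emb):
            acc[i] += w * x
    norm = math.sqrt(sum(x * x for x in acc))
    return [x / norm for x in acc] if norm > 0 else acc


# Toy 2-D embeddings standing in for the 128-D vectors.
embeddings = {"a": [1.0, 0.0], "b": [0.0, 1.0]}
items = {
    "a": ({"overall": 100, "recent": 10}, 0.0),  # long-term favorite, still played
    "b": ({"recent": 3}, 60.0),                  # brief, older interest
}
vec = user_vector(items, embeddings)
print(vec)
```

Using `log1p` on playcounts keeps a single heavily replayed item from dominating the average.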

Fallback Hierarchy:

  1. Exact match → Precomputed embedding (O(1)) - ~60% coverage
  2. Fuzzy match → FAISS similarity search - ~25% coverage
  3. Zero-shot → Weighted average of similar artists - ~10% coverage
  4. LRU Cache → 8K entries, ~75% hit rate
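
The fallback order can be sketched as a single cached lookup function. This is an illustration under assumptions, not the project's implementation: string matching from Python's `difflib` stands in for the FAISS similarity search, the 0.85 cutoff is arbitrary, "8K entries" is taken to mean 8192, and the `EMBEDDINGS` and `SIMILAR_ARTISTS` tables are toy data.

```python
import difflib
from functools import lru_cache

# Toy data; real embeddings are 128-D and the tables are hypothetical.
EMBEDDINGS = {"radiohead": [1.0, 0.0], "portishead": [0.0, 1.0]}
SIMILAR_ARTISTS = {"thom yorke": [("radiohead", 0.9), ("portishead", 0.1)]}


@lru_cache(maxsize=8192)  # assumed size for "8K entries"
def lookup(name):
    """Return (embedding, strategy); embedding is None when nothing matches."""
    key = name.strip().lower()

    # 1. Exact match: dictionary lookup, O(1).
    if key in EMBEDDINGS:
        return tuple(EMBEDDINGS[key]), "exact"

    # 2. Fuzzy match: closest known name (difflib here, FAISS in the README).
    close = difflib.get_close_matches(key, list(EMBEDDINGS), n=1, cutoff=0.85)
    if close:
        return tuple(EMBEDDINGS[close[0]]), "fuzzy"

    # 3. Zero-shot: similarity-weighted average of known similar artists.
    neighbors = [
        (EMBEDDINGS[artist], weight)
        for artist, weight in SIMILAR_ARTISTS.get(key, [])
        if artist in EMBEDDINGS
    ]
    total = sum(weight for _, weight in neighbors)
    if total > 0:
        dim = len(neighbors[0][0])
        avg = tuple(
            sum(emb[i] * weight for emb, weight in neighbors) / total
            for i in range(dim)
        )
        return avg, "zero-shot"

    return None, "missing"


for query in ["Radiohead", "radiohed", "Thom Yorke", "zzzz"]:
    print(query, lookup(query))
```

Results are returned as tuples so that cached values cannot be mutated by callers.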

Scientific Foundation

Graph Neural Networks combine collaborative filtering with content-based features and relational structure. Each GNN layer aggregates neighbor information:

e_i^(k+1) = Σ(neighbors j) [1/√(degree_i × degree_j)] × e_j^(k)

Final embedding averages all layers (local + global context).
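
The propagation rule and layer averaging can be traced on a tiny graph. The sketch below assumes an undirected graph given as an edge list and a uniform mean over layers 0 through K, as in the LightGCN paper; the function name `lightgcn_embeddings` and the toy graph are illustrative, and the production model uses PyTorch Geometric rather than plain Python.

```python
import math


def lightgcn_embeddings(edges, init, num_layers=3):
    """Propagate embeddings over an undirected graph and average all layers.

    edges: iterable of (u, v) pairs
    init: {node: list of floats}, the layer-0 embeddings
    """
    neighbors = {node: set() for node in init}
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    degree = {node: len(nbrs) for node, nbrs in neighbors.items()}
    dim = len(next(iter(init.values())))

    layers = [dict(init)]
    for _ in range(num_layers):
        prev, nxt = layers[-1], {}
        for i in init:
            acc = [0.0] * dim
            for j in neighbors[i]:
                # Symmetric normalization: 1 / sqrt(degree_i * degree_j).
                coef = 1.0 / math.sqrt(degree[i] * degree[j])
                for d in range(dim):
                    acc[d] += coef * prev[j][d]
            nxt[i] = acc
        layers.append(nxt)

    # Final embedding: mean over layer 0 (the input) through layer K.
    return {
        i: [sum(layer[i][d] for layer in layers) / len(layers) for d in range(dim)]
        for i in init
    }


# Two connected nodes with orthogonal starting embeddings.
init = {"track": [1.0, 0.0], "artist": [0.0, 1.0]}
final = lightgcn_embeddings([("track", "artist")], init, num_layers=3)
print(final)
```

There are no weight matrices or nonlinearities in the propagation step; the only learned parameters are the layer-0 embeddings.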

References

  • He et al. (2020) - "LightGCN: Simplifying and Powering Graph Convolution Network for Recommendation"
  • Rendle et al. (2009) - "BPR: Bayesian Personalized Ranking from Implicit Feedback"

Privacy & Compliance

  • Only public Last.fm API data
  • No scraping, respects rate limits
  • OAuth 1.0, no password storage
  • Anonymous embeddings (non-reversible)
  • GDPR compliant with right to deletion

License

MIT License
