Recommendation System Architecture

Content-Based Filtering with SBERT and Cosine Similarity

📊 System Flow Diagram

  • Input: GPX file
  • Data Processing: preprocessing (cleaning, lowercase)
  • Feature Extraction: SBERT embedding (384 dimensions)
  • Similarity: cosine similarity, Q · D / (||Q|| × ||D||)
  • Output: top-N ranking (recommendations)

1. Data Collection & Fusion

  • ✓ Feature extraction from GPX (distance, elevation, grade, Naismith duration)
  • ✓ Manual descriptions (vegetation, water sources, panorama)
  • ✓ Route coordinates for map visualization
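
The GPX-derived features listed above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it assumes the track points have already been read from the GPX file into `(lat, lon, elevation_m)` tuples, and the names `haversine_m` and `route_features` are hypothetical. The grade definition (total ascent over horizontal distance) is also an assumption, since the source does not say how grade is computed. Duration uses the common metric form of Naismith's rule: one hour per 5 km plus one hour per 600 m of ascent.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance between two lat/lon points, in metres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def route_features(points):
    """points: list of (lat, lon, elevation_m) tuples in track order."""
    distance_m = 0.0
    ascent_m = 0.0
    for (lat1, lon1, ele1), (lat2, lon2, ele2) in zip(points, points[1:]):
        distance_m += haversine_m(lat1, lon1, lat2, lon2)
        ascent_m += max(0.0, ele2 - ele1)  # count climbs only, ignore descents
    distance_km = distance_m / 1000
    # Assumed definition: average grade = total ascent / horizontal distance.
    grade_pct = 100 * ascent_m / distance_m if distance_m > 0 else 0.0
    # Naismith's rule: 1 h per 5 km walked plus 1 h per 600 m climbed.
    naismith_h = distance_km / 5 + ascent_m / 600
    return {
        "distance_km": distance_km,
        "ascent_m": ascent_m,
        "grade_pct": grade_pct,
        "naismith_h": naismith_h,
    }
```

Real GPX tracks tend to have noisy elevation samples, so a production version would probably smooth elevations before summing the ascent.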

2. Text Preprocessing

  • ✓ Data Cleaning (regex: URLs, non-ASCII characters)
  • ✓ Case Folding (lowercase)
  • ✓ Selective Stopword Removal (keep negations & adjectives)
  • ✗ No Stemming (SBERT is context-sensitive)
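
The preprocessing steps above might look something like the sketch below. The stopword and negation lists are short illustrative placeholders, not the lists used by the system, and `preprocess` is a hypothetical name. Adjectives survive simply because they are never added to the stopword list, and negations are explicitly protected even if they appear in it. No stemming is applied, matching the list above.

```python
import re

URL_RE = re.compile(r"https?://\S+|www\.\S+")
NON_ASCII_RE = re.compile(r"[^\x00-\x7F]+")

# Illustrative lists only; a real system would use a fuller Indonesian list.
STOPWORDS = {"yang", "dan", "di", "ke", "dari", "ini", "itu",
             "dengan", "untuk", "tidak", "bukan"}
NEGATIONS = {"tidak", "bukan", "tanpa", "kurang"}  # always kept
REMOVABLE = STOPWORDS - NEGATIONS


def preprocess(text):
    """Clean, lowercase, and selectively remove stopwords from a description."""
    text = URL_RE.sub(" ", text)        # data cleaning: strip URLs
    text = NON_ASCII_RE.sub(" ", text)  # data cleaning: strip non-ASCII characters
    text = text.lower()                 # case folding
    tokens = text.split()
    # Compare without trailing punctuation, but keep the token as written.
    kept = [t for t in tokens if t.strip(".,;:!?") not in REMOVABLE]
    return " ".join(kept)
```

Keeping "tidak" matters here: "tidak terjal" (not steep) and "terjal" (steep) describe opposite routes, and dropping the negation would push their embeddings together.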

3. SBERT Embedding

  • ✓ Model: paraphrase-multilingual-MiniLM-L12-v2
  • ✓ Embedding dimension: 384
  • ✓ Supports Indonesian
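
As a rough sketch of the embedding step, the model named above can be loaded through the `sentence-transformers` package. The helper names `load_model` and `embed_texts` are hypothetical, and normalising the embeddings is a choice made for this example rather than something the source states. The import is kept inside `load_model` so the rest of the code can be exercised without the package installed.

```python
MODEL_NAME = "paraphrase-multilingual-MiniLM-L12-v2"


def load_model(name=MODEL_NAME):
    """Load the SBERT model (weights are downloaded on first use)."""
    # Requires: pip install sentence-transformers
    from sentence_transformers import SentenceTransformer
    return SentenceTransformer(name)


def embed_texts(texts, model):
    """Encode texts into L2-normalised vectors, one row per input text."""
    # normalize_embeddings=True makes each vector unit length, so a plain
    # dot product between two rows equals their cosine similarity.
    return model.encode(list(texts), normalize_embeddings=True)
```

Calling `embed_texts(["jalur landai dengan sumber air"], load_model())` should return an array with one 384-dimensional row per input text. Route embeddings only need to be computed once and can be cached, so that only the user's query is encoded at search time.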

4. Cosine Similarity

Sim(Q, D) = (Q · D) / (||Q|| × ||D||)

  • Q = Query embedding (user search)
  • D = Document embedding (route)
  • Range = -1 to 1 (1 = same direction, i.e. most similar)
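
The formula and the top-N ranking step can be illustrated with a small plain-Python sketch. It uses only the standard library to stay self-contained; with real 384-dimensional embeddings a vectorised NumPy version would be the more usual choice. The function names, the route names in the usage note, and the decision to return 0.0 for zero vectors are assumptions made for this example.

```python
import math


def cosine_similarity(q, d):
    """Sim(Q, D) = (Q . D) / (||Q|| * ||D||) for two equal-length vectors."""
    dot = sum(a * b for a, b in zip(q, d))
    norm_q = math.sqrt(sum(a * a for a in q))
    norm_d = math.sqrt(sum(b * b for b in d))
    if norm_q == 0 or norm_d == 0:
        return 0.0  # convention chosen here: zero vectors match nothing
    return dot / (norm_q * norm_d)


def top_n(query_vec, route_vecs, n=3):
    """route_vecs: dict of route name -> embedding.

    Returns the n best (name, score) pairs, highest similarity first.
    """
    scored = [(name, cosine_similarity(query_vec, vec))
              for name, vec in route_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:n]
```

For example, with the query `[1, 0]` and the toy route vectors `{"a": [2, 0], "b": [0, 3], "c": [-1, 0]}`, route `a` scores 1.0 (same direction), `b` scores 0.0 (orthogonal), and `c` scores -1.0 (opposite direction), so `a` is ranked first.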

🛠️ Technology Stack

🐘
Laravel 11
Backend Framework
🐍
Python 3.11
ML Processing
πŸ€–
SBERT
Sentence Transformers
πŸ—ΊοΈ
Leaflet.js
Map Visualization