Smart Recommendations

Overview

Glean’s smart recommendation system is built on vector embeddings. It learns your reading preferences from your feedback and calculates a preference score for each article, helping you prioritize the content you’re interested in.

How It Works

Vector Embeddings

When an article is ingested, the system automatically generates a vector representation (embedding) for it:

Extract article title and summary
Generate vector using Embedding model
Store in Milvus vector database
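
The three ingestion steps above can be sketched roughly as follows. This is a minimal illustration, not Glean's actual code: `build_embedding_text` and `ingest_article` are hypothetical names, the `embed` callable stands in for whichever embedding model is configured, and a plain dictionary stands in for the Milvus collection.

```python
from typing import Callable, Dict, List

Vector = List[float]


def build_embedding_text(title: str, summary: str) -> str:
    """Join the title and summary into the text that gets embedded.

    Joining with a newline is an assumption made for this sketch.
    """
    parts = [title.strip(), summary.strip()]
    return "\n".join(part for part in parts if part)


def ingest_article(
    article_id: str,
    title: str,
    summary: str,
    embed: Callable[[str], Vector],  # hypothetical stand-in for the embedding model
    store: Dict[str, Vector],        # hypothetical stand-in for the Milvus collection
) -> Vector:
    """Extract text, generate the vector, and store it under the article ID."""
    text = build_embedding_text(title, summary)
    vector = embed(text)
    store[article_id] = vector
    return vector
```

Passing the embedding function in as a parameter keeps the sketch independent of the provider, which mirrors how the provider is chosen through configuration rather than in code.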

Preference Learning

The system learns preferences based on your feedback:

Action	Signal Weight	Description
👍 Like	+1.0	Explicit positive feedback
⭐ Bookmark	+0.7	Implicit positive feedback
👎 Dislike	-1.0	Explicit negative feedback

Preference Model

The system maintains a preference model for each user:

User Preference
├── Positive Vector (positive_embedding)  # Aggregated vector of liked content
├── Positive Weight (positive_count)      # Cumulative positive signals
├── Negative Vector (negative_embedding)  # Aggregated vector of disliked content
├── Negative Weight (negative_count)      # Cumulative negative signals
├── Source Affinity (source_affinity)     # Positive/negative stats per feed
└── Author Affinity (author_affinity)     # Positive/negative stats per author
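
As an illustration, the structure above could be modeled with dataclasses along these lines. The field names come from the tree above; the types, the `AffinityStats` helper, and the `new_user_preference` function are assumptions made for this sketch.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class AffinityStats:
    """Positive/negative totals for one feed or author (hypothetical shape)."""
    positive: float = 0.0
    negative: float = 0.0


@dataclass
class UserPreference:
    positive_embedding: List[float]   # aggregated vector of liked content
    negative_embedding: List[float]   # aggregated vector of disliked content
    positive_count: float = 0.0       # cumulative positive signals
    negative_count: float = 0.0       # cumulative negative signals
    source_affinity: Dict[str, AffinityStats] = field(default_factory=dict)
    author_affinity: Dict[str, AffinityStats] = field(default_factory=dict)


def new_user_preference(dimension: int) -> UserPreference:
    """Create an empty model whose vectors match the embedding dimension."""
    return UserPreference(
        positive_embedding=[0.0] * dimension,
        negative_embedding=[0.0] * dimension,
    )
```

The counts are floats here because the signal weights are fractional (a bookmark contributes 0.7).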

Preference Scores

Calculation Method

Preference scores range from 0 to 100:

Score = (positive_similarity - negative_similarity + 1) / 2 × 100 × confidence + 50 × (1 - confidence)

Positive Similarity: Cosine similarity between the article vector and the positive preference vector
Negative Similarity: Cosine similarity between the article vector and the negative preference vector
Confidence: A value derived from the feedback count; more feedback means higher confidence
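
The formula can be sketched in plain Python as shown below. Two parts are assumptions rather than documented behavior: the confidence curve (here a linear ramp that saturates at 20 feedback signals) and the final clamp to 0-100. The clamp is included because the difference of two cosine similarities can range from -2 to 2, so the unclamped expression can fall outside 0-100.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 when either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def confidence_from_count(feedback_count: float, saturation: float = 20.0) -> float:
    """Hypothetical confidence curve: linear ramp from 0 to 1."""
    return max(0.0, min(1.0, feedback_count / saturation))


def preference_score(
    article: Sequence[float],
    positive: Sequence[float],
    negative: Sequence[float],
    feedback_count: float,
) -> float:
    pos_sim = cosine_similarity(article, positive)
    neg_sim = cosine_similarity(article, negative)
    confidence = confidence_from_count(feedback_count)
    raw = (pos_sim - neg_sim + 1) / 2 * 100
    score = raw * confidence + 50 * (1 - confidence)
    # Assumption: clamp so the result stays within the documented 0-100 range.
    return max(0.0, min(100.0, score))
```

With zero confidence the second term dominates and the score is exactly 50, which matches the default for users without feedback.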

New User Handling

When new users have no feedback data:

All articles default to a score of 50
Articles are displayed sorted by time
Results become gradually more personalized as feedback accumulates

Smart Recommendation View

In smart recommendation view mode, articles are displayed in layers by preference score:

Layering Rules

Layer	Score Range	Display	Sorting
📌 Recommended	≥ 70	Pinned at top	By score descending
📰 Normal	≥ 40 and < 70	Normal display	By time descending
🔽 May Not Interest	< 40	Collapsed by default	By time descending

Threshold Customization

You can adjust layering thresholds in settings:

Setting	Default	Description
Recommendation Score Threshold	70	Articles scoring at or above this value appear in the Recommended layer
Not Interested Score Threshold	40	Articles scoring below this value appear in the collapsed layer
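
The layering and sorting rules can be sketched as below, with both thresholds exposed as parameters. The names `ScoredArticle` and `layer_articles` are hypothetical, and the sketch follows the layering table in treating a score equal to the recommendation threshold as recommended.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, Iterable, List


@dataclass
class ScoredArticle:
    title: str
    score: float
    published_at: datetime


def layer_articles(
    articles: Iterable[ScoredArticle],
    recommend_threshold: float = 70.0,
    not_interested_threshold: float = 40.0,
) -> Dict[str, List[ScoredArticle]]:
    """Split articles into the three layers and sort each one."""
    layers: Dict[str, List[ScoredArticle]] = {
        "recommended": [],
        "normal": [],
        "may_not_interest": [],
    }
    for article in articles:
        if article.score >= recommend_threshold:
            layers["recommended"].append(article)
        elif article.score < not_interested_threshold:
            layers["may_not_interest"].append(article)
        else:
            layers["normal"].append(article)

    # Recommended: highest score first. Other layers: newest first.
    layers["recommended"].sort(key=lambda a: a.score, reverse=True)
    layers["normal"].sort(key=lambda a: a.published_at, reverse=True)
    layers["may_not_interest"].sort(key=lambda a: a.published_at, reverse=True)
    return layers
```

Calling `layer_articles(articles, recommend_threshold=80, not_interested_threshold=50)` shows the effect of changing both settings: fewer articles are pinned and more are collapsed.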

Real-time Scoring

When you browse the article list, the system:

Retrieves article vector representations
Calculates similarity with your preference vectors
Computes preference scores in real-time
Displays articles in layers by score

Preference Updates

Incremental Updates

After each feedback action, the preference vectors are updated with an incremental weighted average:

# Simplified example (vector arithmetic is element-wise)
new_vector = (old_vector * old_weight + article_vector * signal_weight) / (old_weight + signal_weight)

This approach:

Avoids recalculating the entire feedback history
Lets new feedback take effect immediately
Gradually dilutes older preferences as new feedback accumulates
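
A runnable version of the simplified update might look like the sketch below. It uses the signal weights from the Preference Learning table. Routing a dislike to the negative vector with the magnitude of its weight is an assumption, as are the function names and the `(vector, weight)` tuple layout.

```python
from typing import List, Tuple

# Signal weights from the Preference Learning table.
SIGNAL_WEIGHTS = {"like": 1.0, "bookmark": 0.7, "dislike": -1.0}

Aggregate = Tuple[List[float], float]  # (vector, cumulative weight)


def update_vector(
    old_vector: List[float],
    old_weight: float,
    article_vector: List[float],
    signal_weight: float,
) -> Aggregate:
    """Fold one article into a weighted running average."""
    if signal_weight <= 0:
        raise ValueError("signal_weight must be positive")
    new_weight = old_weight + signal_weight
    new_vector = [
        (old * old_weight + art * signal_weight) / new_weight
        for old, art in zip(old_vector, article_vector)
    ]
    return new_vector, new_weight


def apply_feedback(
    action: str,
    article_vector: List[float],
    positive: Aggregate,
    negative: Aggregate,
) -> Tuple[Aggregate, Aggregate]:
    """Update the positive or negative aggregate depending on the signal sign."""
    signal = SIGNAL_WEIGHTS[action]
    if signal >= 0:
        positive = update_vector(positive[0], positive[1], article_vector, signal)
    else:
        # Assumption: negative signals use the magnitude of their weight.
        negative = update_vector(negative[0], negative[1], article_vector, abs(signal))
    return positive, negative
```

Only the current vector and its cumulative weight are needed for each update, so no stored feedback history has to be replayed.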

Asynchronous Processing

Preference updates run asynchronously in background tasks, so they do not affect the response time of the main operation.

Embedding Providers

The system supports multiple embedding providers:

sentence-transformers (Default)

Runs locally, no API key required:

EMBEDDING_PROVIDER=sentence-transformers
EMBEDDING_MODEL=all-MiniLM-L6-v2
EMBEDDING_DIMENSION=384

OpenAI

Uses the OpenAI Embeddings API:

EMBEDDING_PROVIDER=openai
EMBEDDING_MODEL=text-embedding-3-small
EMBEDDING_DIMENSION=1536
EMBEDDING_API_KEY=sk-xxx

Volcengine

Uses ByteDance’s embedding service:

EMBEDDING_PROVIDER=volc-engine
EMBEDDING_MODEL=doubao-embedding
EMBEDDING_DIMENSION=1024
EMBEDDING_API_KEY=your-api-key
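
For illustration, an application could read these variables as in the sketch below. The variable names are the ones shown above; `EmbeddingConfig`, `load_embedding_config`, the fallback model and dimension, and the rule that non-local providers require an API key are assumptions made for this example.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Providers that run locally and need no API key.
LOCAL_PROVIDERS = {"sentence-transformers"}


@dataclass(frozen=True)
class EmbeddingConfig:
    provider: str
    model: str
    dimension: int
    api_key: Optional[str] = None


def load_embedding_config(env: Mapping[str, str] = os.environ) -> EmbeddingConfig:
    """Build the embedding configuration from environment variables."""
    provider = env.get("EMBEDDING_PROVIDER", "sentence-transformers")
    # Assumed fallbacks, taken from the sentence-transformers example above.
    model = env.get("EMBEDDING_MODEL", "all-MiniLM-L6-v2")
    dimension = int(env.get("EMBEDDING_DIMENSION", "384"))
    api_key = env.get("EMBEDDING_API_KEY") or None

    if provider not in LOCAL_PROVIDERS and api_key is None:
        raise ValueError(f"EMBEDDING_API_KEY is required for provider '{provider}'")
    return EmbeddingConfig(provider, model, dimension, api_key)
```

EMBEDDING_DIMENSION has to match the output size of the chosen model (for example, 384 for all-MiniLM-L6-v2 and 1536 for text-embedding-3-small by default), since stored vectors and preference vectors must have the same length to be compared.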

Usage Tips

Quick Cold Start

Actively give like/dislike feedback while browsing articles
The first 10-20 feedback actions have the greatest impact on the model
Bookmarks also count as positive signals

Continuous Optimization

Regularly provide feedback to keep the model updated
Mark content you dislike as well; this helps filter out noise
If recommendations are inaccurate, check whether you’ve given enough feedback

Next Steps

Admin Dashboard - System management features
Configuration - Embedding configuration details