Stage 3: Intelligence Layer
Raw data only tells part of the story. A product might have 1,000 five-star reviews, but if half are from paid reviewers, that's not genuine love.
This stage adds three layers of AI-powered intelligence to every review.
The Three Models
| Model | What It Detects | Output |
|---|---|---|
| Quality Scoring | High-quality vs. low-quality reviews | Score 0-1 |
| Fake Detection | Suspicious/incentivized reviews | Probability 0-1 |
| Sentiment Analysis | Emotional tone of text | Score 0-1 |
Model A: Review Quality Scoring
The Problem
Not all reviews are equal. Compare these two 5-star reviews:
Review 1: "Great!"
Review 2: "I've been using this for 3 months now. My oily T-zone is finally under control without feeling tight. The texture is lightweight and absorbs quickly. Only downside is the price."
To a naive algorithm, both are "5 stars." But Review 2 is obviously more valuable.
How It Works
We train a machine learning model to predict which reviews are high-quality. The clever trick: we use incentivized reviews as a training signal.
Why? Incentivized reviews tend to be:
- More detailed (people feel obligated to write more for free stuff)
- More structured (they follow prompts)
- But also biased (they received something)
By learning to detect these patterns, the model also learns what makes a review substantive.
Features Used (11 total)
- Text length and word count
- Rating and recommendation
- Helpful votes received
- User's total review count
- User's average rating history
- Whether they included photos
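The sketch below shows roughly how such a model could be trained. The column names, the input file, and the scikit-learn gradient-boosting classifier are assumptions for illustration; the post doesn't specify the actual schema or algorithm.

```python
# A minimal training sketch, assuming a pandas DataFrame of reviews with
# columns like review_text, rating, is_recommended, helpful_votes,
# user_review_count, user_avg_rating, has_photos, and is_incentivized.
# Column names and the classifier choice are illustrative stand-ins.
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

def build_features(df: pd.DataFrame) -> pd.DataFrame:
    """Derive the numeric features listed above from raw review rows."""
    return pd.DataFrame({
        "text_length": df["review_text"].str.len(),
        "word_count": df["review_text"].str.split().str.len(),
        "rating": df["rating"],
        "is_recommended": df["is_recommended"].astype(int),
        "helpful_votes": df["helpful_votes"],
        "user_review_count": df["user_review_count"],
        "user_avg_rating": df["user_avg_rating"],
        "has_photos": df["has_photos"].astype(int),
    })

reviews = pd.read_parquet("reviews.parquet")      # hypothetical input file
X = build_features(reviews)
y = reviews["is_incentivized"].astype(int)        # the proxy label

X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=42)
model = GradientBoostingClassifier().fit(X_train, y_train)

# The quality score is the model's predicted probability that a review
# "looks incentivized", i.e. detailed and structured, on a 0-1 scale.
reviews["quality_score"] = model.predict_proba(X)[:, 1]
```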
The Output
Every review gets a quality score from 0 to 1. We also aggregate to product level:
- Average quality across all reviews
- Percentage of high-quality reviews
- Percentage of low-quality reviews
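Continuing the sketch above (and assuming a product_id column), the product-level roll-up could be a simple group-by. The 0.7 and 0.3 cut-offs for "high" and "low" quality are made up for illustration.

```python
# Product-level aggregation of per-review quality scores.
# Thresholds (0.7 / 0.3) are illustrative, not the pipeline's actual values.
product_quality = (
    reviews
    .assign(
        high_quality=reviews["quality_score"] >= 0.7,
        low_quality=reviews["quality_score"] <= 0.3,
    )
    .groupby("product_id")
    .agg(
        avg_quality=("quality_score", "mean"),
        pct_high_quality=("high_quality", "mean"),   # share of reviews above the cut-off
        pct_low_quality=("low_quality", "mean"),
        review_count=("quality_score", "size"),
    )
)
```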
Model B: Fake Review Detection
The Problem
Some reviews are suspicious:
- Brand employees reviewing their own products
- Paid reviewers giving 5 stars for compensation
- Competitors leaving fake negative reviews
- Bot-generated reviews with generic text
A product with 500 reviews where 200 are fake is not as good as one with 300 genuine reviews.
How It Works
We use 68+ different signals to detect fake reviews, including 61+ features across 8 categories.
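The full feature set isn't enumerated here, so the snippet below only illustrates the shape of the input: a handful of hypothetical signals, grouped by the kind of category (text statistics, rating behavior, reviewer history) such a detector typically draws on.

```python
# Illustrative only: a few hypothetical signals for one review, grouped by
# category. The real detector uses 68+ signals that aren't listed in this post.
def fake_detection_features(review: dict, user: dict) -> dict:
    text = review["text"]
    return {
        # text statistics
        "word_count": len(text.split()),
        "exclamation_ratio": text.count("!") / max(len(text), 1),
        # rating behavior
        "rating": review["rating"],
        "rating_deviation": abs(review["rating"] - user["avg_rating"]),
        # reviewer history
        "user_review_count": user["review_count"],
        "share_five_star": user["five_star_count"] / max(user["review_count"], 1),
    }
```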
The Ensemble Approach
We combine two models for better accuracy:
- Traditional Machine Learning — Fast, interpretable, good at numerical patterns (like "always gives 5 stars")
- BERT Language Model — Understands text nuance, catches sophisticated fakes that write well but feel off
Final score: 50% traditional + 50% BERT
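The blend itself is just an equally weighted average of two probabilities (the function name here is ours):

```python
def ensemble_fake_probability(traditional_prob: float, bert_prob: float) -> float:
    """Blend the two detectors 50/50; both inputs are probabilities in [0, 1]."""
    return 0.5 * traditional_prob + 0.5 * bert_prob

ensemble_fake_probability(0.2, 0.8)  # -> 0.5, i.e. "uncertain"
```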
The Output
Every review gets a "fake probability" from 0 to 1:
- 0.0 = Almost certainly genuine
- 0.5 = Uncertain
- 1.0 = Almost certainly fake/incentivized
We don't label reviews as "fake" or "not fake." Reality is nuanced. A review might be genuine but low-quality, or incentivized but still honest. The probability lets downstream systems make their own decisions.
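For example, a downstream consumer might weight each review by how likely it is to be genuine instead of filtering anything out. The snippet assumes the reviews DataFrame from earlier now carries a fake_probability column; it is an illustration, not the ranking formula covered in the next stage.

```python
# Weight reviews by their likelihood of being genuine, without ever
# drawing a hard fake / not-fake line.
reviews["genuine_weight"] = 1.0 - reviews["fake_probability"]

weighted_avg_rating = (
    (reviews["rating"] * reviews["genuine_weight"]).groupby(reviews["product_id"]).sum()
    / reviews["genuine_weight"].groupby(reviews["product_id"]).sum()
)
```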
Model C: Sentiment Analysis
The Problem
Star ratings are blunt instruments. Compare:
4 stars: "This is actually my holy grail product, I'd give it 5 but I'm strict with ratings"
4 stars: "It's okay I guess, nothing special, might repurchase if on sale"
Both are 4 stars, but the sentiment is completely different.
How It Works
We use a pre-trained language model (DistilBERT fine-tuned on sentiment) that reads the review text and outputs a sentiment score.
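A minimal sketch using the Hugging Face transformers pipeline, whose default sentiment model is DistilBERT fine-tuned on SST-2. Mapping the model's POSITIVE/NEGATIVE output onto a single 0-1 scale is our own convention for this sketch.

```python
from transformers import pipeline

# Defaults to distilbert-base-uncased-finetuned-sst-2-english.
sentiment_model = pipeline("sentiment-analysis")

def sentiment_score(text: str) -> float:
    """Map POSITIVE/NEGATIVE confidence onto a single 0-1 scale."""
    result = sentiment_model(text[:512])[0]   # crude truncation for very long reviews
    if result["label"] == "POSITIVE":
        return result["score"]                # confident positive -> close to 1.0
    return 1.0 - result["score"]              # confident negative -> close to 0.0

sentiment_score("My oily T-zone is finally under control. Holy grail.")  # high
sentiment_score("It's okay I guess, nothing special.")                   # middling to low
```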
The Output
- 0.0 = Very negative sentiment
- 0.5 = Neutral
- 1.0 = Very positive sentiment
When we combine sentiment with fake detection, we can find products where genuine reviewers express genuine enthusiasm — not just high star counts.
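With illustrative thresholds, one way to surface that is the share of each product's reviews that are both likely genuine and clearly enthusiastic:

```python
# Thresholds (0.3 and 0.8) are made up for the example.
genuine_enthusiasm = (reviews["fake_probability"] < 0.3) & (reviews["sentiment_score"] > 0.8)
share_genuine_love = genuine_enthusiasm.groupby(reviews["product_id"]).mean()
```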
How the Models Work Together
Review Text ────────────────────┐
│
Rating & Metadata ──────────────┼───► Quality Model ───► quality_score
│
User History ───────────────────┤
│
All Features (68+) ─────────────┼───► Fake Detection ──► fake_probability
│ │
│ ├─► Traditional ML
│ └─► BERT (ensemble)
│
Text Only ──────────────────────┴───► Sentiment ───────► sentiment_score

Each review now has three scores that travel with it to the ranking stage.
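Concretely, the record each review carries into ranking might look something like this; field names are assumptions, not the actual schema.

```python
from dataclasses import dataclass

@dataclass
class ScoredReview:
    review_id: str
    product_id: str
    rating: int
    quality_score: float      # Model A: 0 = low quality, 1 = high quality
    fake_probability: float   # Model B: 0 = genuine, 1 = fake/incentivized
    sentiment_score: float    # Model C: 0 = negative, 1 = positive
```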
Training Data
For Quality & Fake Detection
We use the is_incentivized flag as a proxy label. Incentivized reviews are:
- Not necessarily "bad"
- But systematically different from organic reviews
- A learnable signal
For Sentiment
We use a pre-trained model (no custom training needed). DistilBERT was already fine-tuned on millions of sentences.
What's Next?
Three scores per review. Now we combine them into one number that captures genuine product love.
Next: Product Ranking → — The Love Score formula.