Stage 3: Intelligence Layer

Raw data only tells part of the story. A product might have 1,000 five-star reviews, but if half are from paid reviewers, that's not genuine love.

This stage adds three layers of AI-powered intelligence to every review.

The Three Models

Model              | What It Detects                       | Output
Quality Scoring    | High-quality vs. low-quality reviews  | Score 0-1
Fake Detection     | Suspicious/incentivized reviews       | Probability 0-1
Sentiment Analysis | Emotional tone of text                | Score 0-1

Model A: Review Quality Scoring

The Problem

Not all reviews are equal. Compare these two 5-star reviews:

Review 1: "Great!"

Review 2: "I've been using this for 3 months now. My oily T-zone is finally under control without feeling tight. The texture is lightweight and absorbs quickly. Only downside is the price."

To a naive algorithm, both are "5 stars." But Review 2 is obviously more valuable.

How It Works

We train a machine learning model to predict which reviews are high-quality. The clever trick: we use incentivized reviews as a training signal.

Why? Incentivized reviews tend to be:

  • More detailed (people feel obligated to write more for free stuff)
  • More structured (they follow prompts)
  • But also biased (they received something)

By learning to detect these patterns, the model also learns what makes a review substantive.

Features Used (11 total), including:

  • Text length and word count
  • Rating and recommendation
  • Helpful votes received
  • User's total review count
  • User's average rating history
  • Whether they included photos
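
A minimal sketch of how such a model could be trained, assuming a pandas DataFrame of reviews with an is_incentivized column as the proxy label. The column names, the input file, and the gradient-boosted classifier are illustrative choices, not the actual pipeline:

```python
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

def build_features(df: pd.DataFrame) -> pd.DataFrame:
    """Assemble review-level features (an illustrative subset of the 11)."""
    return pd.DataFrame({
        "text_length": df["review_text"].str.len(),
        "word_count": df["review_text"].str.split().str.len(),
        "rating": df["rating"],
        "is_recommended": df["is_recommended"].astype(int),
        "helpful_votes": df["helpful_votes"],
        "user_review_count": df["user_review_count"],
        "user_avg_rating": df["user_avg_rating"],
        "has_photos": df["photo_count"].gt(0).astype(int),
    })

reviews = pd.read_csv("reviews.csv")     # hypothetical input file
X = build_features(reviews)
y = reviews["is_incentivized"]           # proxy label for "substantive"

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
model = GradientBoostingClassifier().fit(X_train, y_train)
print("held-out accuracy:", model.score(X_test, y_test))

# Quality score = predicted probability, already in [0, 1]
reviews["quality_score"] = model.predict_proba(X)[:, 1]
```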

The Output

Every review gets a quality score from 0 to 1. We also aggregate to product level:

  • Average quality across all reviews
  • Percentage of high-quality reviews
  • Percentage of low-quality reviews
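
A sketch of the product-level roll-up, assuming each row carries a product_id and the quality_score computed above; the 0.7/0.3 cut-offs are illustrative, not the actual thresholds:

```python
# Aggregate review-level quality scores to product level.
HIGH, LOW = 0.7, 0.3  # illustrative thresholds for "high" and "low" quality

product_quality = reviews.groupby("product_id")["quality_score"].agg(
    avg_quality="mean",
    pct_high_quality=lambda s: (s >= HIGH).mean(),
    pct_low_quality=lambda s: (s <= LOW).mean(),
)
```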

Model B: Fake Review Detection

The Problem

Some reviews are suspicious:

  • Brand employees reviewing their own products
  • Paid reviewers giving 5 stars for compensation
  • Competitors leaving fake negative reviews
  • Bot-generated reviews with generic text

A product with 500 reviews where 200 are fake is not as good as one with 300 genuine reviews.

How It Works

We use 68+ different signals, grouped into 8 categories, to detect fake reviews.

The Ensemble Approach

We combine two models for better accuracy:

  1. Traditional Machine Learning — Fast, interpretable, good at numerical patterns (like "always gives 5 stars")

  2. BERT Language Model — Understands text nuance, catches sophisticated fakes that write well but feel off

Final score: 50% traditional + 50% BERT
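
The blend itself is a plain equal-weight average; a minimal sketch, assuming each model exposes its output as a probability in [0, 1]:

```python
def fake_probability(traditional_prob: float, bert_prob: float) -> float:
    """Ensemble score: equal-weight blend of the two detectors."""
    return 0.5 * traditional_prob + 0.5 * bert_prob

# e.g. a review the traditional model finds suspicious but BERT does not
print(fake_probability(0.82, 0.34))  # 0.58 -> leans suspicious, not certain
```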

The Output

Every review gets a "fake probability" from 0 to 1:

  • 0.0 = Almost certainly genuine
  • 0.5 = Uncertain
  • 1.0 = Almost certainly fake/incentivized
🔍 Not Binary

We don't label reviews as "fake" or "not fake." Reality is nuanced. A review might be genuine but low-quality, or incentivized but still honest. The probability lets downstream systems make their own decisions.


Model C: Sentiment Analysis

The Problem

Star ratings are blunt instruments. Compare:

4 stars: "This is actually my holy grail product, I'd give it 5 but I'm strict with ratings"

4 stars: "It's okay I guess, nothing special, might repurchase if on sale"

Both are 4 stars, but the sentiment is completely different.

How It Works

We use a pre-trained language model (DistilBERT fine-tuned on sentiment) that reads the review text and outputs a sentiment score.

The Output

  • 0.0 = Very negative sentiment
  • 0.5 = Neutral
  • 1.0 = Very positive sentiment
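
A minimal sketch using the Hugging Face transformers pipeline, whose default sentiment model is a DistilBERT checkpoint fine-tuned on SST-2; mapping its POSITIVE/NEGATIVE labels onto the 0-1 scale above is an illustrative convention, not necessarily the production mapping:

```python
from transformers import pipeline

# Default checkpoint: distilbert-base-uncased-finetuned-sst-2-english
sentiment = pipeline("sentiment-analysis")

def sentiment_score(text: str) -> float:
    """Map the POSITIVE/NEGATIVE label and confidence onto one 0-1 scale."""
    result = sentiment(text[:500])[0]  # crude length guard for long reviews
    prob = result["score"]
    return prob if result["label"] == "POSITIVE" else 1.0 - prob

print(sentiment_score("This is actually my holy grail product"))  # near 1.0
print(sentiment_score("It's okay I guess, nothing special"))      # much lower
```
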
Combined Power

When we combine sentiment with fake detection, we can find products where genuine reviewers express genuine enthusiasm — not just high star counts.
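
For instance, a downstream filter for that combination might look like this (the 0.3 and 0.8 thresholds are illustrative):

```python
# Reviews that look both genuine and genuinely enthusiastic
genuine_love = reviews[
    (reviews["fake_probability"] < 0.3) & (reviews["sentiment_score"] > 0.8)
]
```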


How the Models Work Together

Review Text ────────────────────┐

Rating & Metadata ──────────────┼───► Quality Model ───► quality_score

User History ───────────────────┤

All Features (68+) ─────────────┼───► Fake Detection ──► fake_probability
                                │         │
                                │         ├─► Traditional ML
                                │         └─► BERT (ensemble)

Text Only ──────────────────────┴───► Sentiment ───────► sentiment_score

Each review now has three scores that travel with it to the ranking stage.
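
Concretely, the per-review record handed to the ranking stage could be as small as this (field names are illustrative):

```python
from dataclasses import dataclass

@dataclass
class ScoredReview:
    review_id: str
    product_id: str
    rating: int
    quality_score: float      # Model A, 0-1
    fake_probability: float   # Model B, 0-1
    sentiment_score: float    # Model C, 0-1
```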


Training Data

For Quality & Fake Detection

We use the is_incentivized flag as a proxy label. Incentivized reviews are:

  • Not necessarily "bad"
  • But systematically different from organic reviews
  • A learnable signal

For Sentiment

We use a pre-trained model (no custom training needed). DistilBERT was pre-trained on a large text corpus and then fine-tuned specifically for sentiment classification.


What's Next?

Three scores per review. Now we combine them into one number that captures genuine product love.

Next: Product Ranking → the Love Score formula.