Content Analysis Summary

Content Optimized | 15 Comparisons | 2025-10-28T16:46:36.597Z

Processing Performance

15 Pages Processed
10s Avg Page Time
101ms Processing Speed

Content Analysis Algorithms

Jaccard Similarity

Measures the similarity between two sets as the size of their intersection divided by the size of their union. Well suited to comparing shared versus unique content elements. Higher scores indicate more content overlap.
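
The intersection-over-union idea can be sketched in a few lines of Python. This is only an illustration: the report does not say how the tool tokenizes pages, so the word-level `tokenize` helper and the handling of empty inputs are assumptions.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of word tokens (assumed tokenization)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard_similarity(a: str, b: str) -> float:
    """Return |A ∩ B| / |A ∪ B| over the word sets of two texts."""
    set_a, set_b = tokenize(a), tokenize(b)
    union = set_a | set_b
    if not union:
        return 0.0  # assumption: two empty documents score 0
    return len(set_a & set_b) / len(union)


# Three shared words out of five distinct words in total.
score = jaccard_similarity(
    "search public records online",
    "search criminal records online",
)
print(f"{score:.0%}")
```

Because the measure works on sets, repeated words and word order have no effect on the score.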

Cosine Similarity

Calculates the cosine of the angle between two document vectors. Well suited to comparing text regardless of document length. Values closer to 1 (100%) indicate more similar content themes.
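
A minimal sketch of the calculation, assuming raw term counts as the document vectors. The tool's actual weighting (for example TF-IDF) is not stated in the report, and the helper names here are hypothetical.

```python
import math
import re
from collections import Counter


def term_counts(text: str) -> Counter:
    """Build a term-frequency vector from lowercase word tokens (assumed weighting)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: str, b: str) -> float:
    """Return the cosine of the angle between two term-frequency vectors."""
    vec_a, vec_b = term_counts(a), term_counts(b)
    dot = sum(vec_a[t] * vec_b[t] for t in vec_a.keys() & vec_b.keys())
    norm_a = math.sqrt(sum(c * c for c in vec_a.values()))
    norm_b = math.sqrt(sum(c * c for c in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumption: an empty document matches nothing
    return dot / (norm_a * norm_b)


# Repeating a document doubles its length but leaves the vector's direction unchanged.
same_theme = cosine_similarity("people search", "people search people search")
no_overlap = cosine_similarity("people search", "weather forecast")
```

Here `same_theme` comes out at approximately 1 and `no_overlap` at 0, which illustrates why the measure is insensitive to document length.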

Fingerprint Matching

Uses content hashing to detect exact or near-exact duplicate content blocks. Highly sensitive to copy-paste scenarios. Even small scores suggest potential plagiarism.
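
One common way to fingerprint content is to hash overlapping word windows (shingles) and compare the resulting hash sets. The sketch below assumes that approach; the report does not name the hashing scheme, so the window size `k`, the use of SHA-1, and the normalization by the smaller set are all illustrative choices.

```python
import hashlib
import re


def fingerprints(text: str, k: int = 5) -> set[str]:
    """Hash every run of k consecutive words (a shingle) in the text."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    shingles = (" ".join(words[i:i + k]) for i in range(len(words) - k + 1))
    return {hashlib.sha1(s.encode("utf-8")).hexdigest() for s in shingles}


def fingerprint_match(a: str, b: str, k: int = 5) -> float:
    """Return the share of the smaller fingerprint set that also occurs in the other."""
    fp_a, fp_b = fingerprints(a, k), fingerprints(b, k)
    if not fp_a or not fp_b:
        return 0.0  # assumption: texts shorter than k words cannot match
    return len(fp_a & fp_b) / min(len(fp_a), len(fp_b))


copied = fingerprint_match(
    "run a background check on anyone in minutes",
    "run a background check on anyone in minutes",
)
reordered = fingerprint_match(
    "one two three four five six",
    "six five four three two one",
)
```

In this sketch `copied` is 1.0 and `reordered` is 0.0. Two pages can therefore share most of their vocabulary, and score well on Jaccard or cosine, while sharing no verbatim passages at all.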

Semantic Analysis

Uses AI models to understand meaning and context beyond keywords. Detects paraphrased or rewritten content that maintains similar meaning. High scores indicate conceptual similarity.

Topic Modeling

Identifies underlying topics and themes across content. Groups content by subject matter similarity. Higher scores indicate sites covering similar topic areas or categories.

Content Similarity Analysis

instantcheckmate_com_20251028_1644_vs_truthfinder_com_20251028_1645

58% Overall Similarity
Jaccard 55%
Cosine 90%
Fingerprint 0%
Semantic 81%
Topic 55%

intelius_com_20251028_1645_vs_truthfinder_com_20251028_1645

50% Overall Similarity
Jaccard 41%
Cosine 78%
Fingerprint 0%
Semantic 71%
Topic 55%

instantcheckmate_com_20251028_1644_vs_intelius_com_20251028_1645

48% Overall Similarity
Jaccard 44%
Cosine 75%
Fingerprint 0%
Semantic 63%
Topic 50%

instantcheckmate_com_20251028_1644_vs_spokeo_com_20251028_1645

37% Overall Similarity
Jaccard 16%
Cosine 85%
Fingerprint 0%
Semantic 23%
Topic 65%

beenverified_com_20251028_1644_vs_truthfinder_com_20251028_1645

36% Overall Similarity
Jaccard 20%
Cosine 91%
Fingerprint 0%
Semantic 22%
Topic 45%

beenverified_com_20251028_1644_vs_instantcheckmate_com_20251028_1644

35% Overall Similarity
Jaccard 20%
Cosine 87%
Fingerprint 0%
Semantic 20%
Topic 45%

beenverified_com_20251028_1644_vs_intelius_com_20251028_1645

35% Overall Similarity
Jaccard 18%
Cosine 84%
Fingerprint 0%
Semantic 16%
Topic 65%

beenverified_com_20251028_1644_vs_spokeo_com_20251028_1645

32% Overall Similarity
Jaccard 15%
Cosine 81%
Fingerprint 0%
Semantic 24%
Topic 30%

spokeo_com_20251028_1645_vs_truthfinder_com_20251028_1645

32% Overall Similarity
Jaccard 13%
Cosine 79%
Fingerprint 0%
Semantic 20%
Topic 45%

intelius_com_20251028_1645_vs_spokeo_com_20251028_1645

26% Overall Similarity
Jaccard 14%
Cosine 66%
Fingerprint 0%
Semantic 17%
Topic 30%

beenverified_com_20251028_1644_vs_truepeoplesearch_com_20251028_1645

0% Overall Similarity
Jaccard 0%
Cosine 0%
Fingerprint 0%
Semantic 0%
Topic 0%

instantcheckmate_com_20251028_1644_vs_truepeoplesearch_com_20251028_1645

0% Overall Similarity
Jaccard 0%
Cosine 0%
Fingerprint 0%
Semantic 0%
Topic 0%

intelius_com_20251028_1645_vs_truepeoplesearch_com_20251028_1645

0% Overall Similarity
Jaccard 0%
Cosine 0%
Fingerprint 0%
Semantic 0%
Topic 0%

spokeo_com_20251028_1645_vs_truepeoplesearch_com_20251028_1645

0% Overall Similarity
Jaccard 0%
Cosine 0%
Fingerprint 0%
Semantic 0%
Topic 0%

truepeoplesearch_com_20251028_1645_vs_truthfinder_com_20251028_1645

0% Overall Similarity
Jaccard 0%
Cosine 0%
Fingerprint 0%
Semantic 0%
Topic 0%

Content Analysis Insights

🚨 Risk Assessment

🔴 -- Critical Similarity
🟠 -- High Concern
🟡 -- Moderate Risk
🟢 -- Acceptable Difference

📊 Algorithm Performance

Most Sensitive: Cosine
Least Sensitive: Fingerprint
Best Detector: Semantic

🔍 Content Patterns

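InstantCheckmate vs TruthFinder shows the highest similarity (58% overall)
TruePeopleSearch appears most unique (0% on every metric against all five other sites); uniform zeros may also mean that no content was captured for it
InstantCheckmate, TruthFinder, and Intelius form a similarity cluster (48% to 58% overall similarity between each pair)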

💡 Recommendations

CRITICAL Investigate high-similarity pairs for potential copying
HIGH Review semantic matches for paraphrased content
MEDIUM Monitor cosine scores for content theme overlap