Content Analysis Summary

Content Optimized 15 Comparisons 2025-10-27T19:36:09.761Z

Processing Performance

15 Pages Processed
8s Avg Page Time
121ms Processing Speed

Content Analysis Algorithms

Jaccard Similarity

Measures the similarity between two sets by dividing the size of their intersection by the size of their union. Well suited to comparing shared versus unique content elements. Higher scores indicate more content overlap.
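As a rough illustration of the ratio described above, the following sketch computes Jaccard similarity over word sets. The tokenizer and the function names are assumptions made for this example; the report does not say how the tool splits content into elements.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of unique word tokens (assumed tokenizer)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard_similarity(a: str, b: str) -> float:
    """Return |A ∩ B| / |A ∪ B| over the word sets of the two texts."""
    set_a, set_b = tokenize(a), tokenize(b)
    union = set_a | set_b
    if not union:
        return 0.0  # convention chosen here for two empty documents
    return len(set_a & set_b) / len(union)


# Three shared words out of five distinct words overall.
score = jaccard_similarity("search public records online",
                           "search criminal records online")
print(f"{score:.0%}")
```

Because the measure works on sets, repeated words count once, which is one reason Jaccard scores can sit well below cosine scores for the same pair.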

Cosine Similarity

Calculates the cosine of the angle between two document vectors, so the score depends on the relative weight of terms rather than on document length. Values closer to 1 (100% in this report) indicate very similar content themes.
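A minimal sketch of the calculation, assuming raw term-frequency vectors built from a simple word tokenizer. The report does not state which weighting (raw counts, TF-IDF, or something else) the tool uses, so treat the vectorization here as a stand-in.

```python
import math
import re
from collections import Counter


def term_counts(text: str) -> Counter:
    """Build a term-frequency vector from lowercase word tokens (assumed weighting)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: str, b: str) -> float:
    """Return the cosine of the angle between the two term-frequency vectors."""
    va, vb = term_counts(a), term_counts(b)
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document has no direction to compare
    return dot / (norm_a * norm_b)


# The second text repeats the first ten times: same direction, different length.
short_doc = "background check"
long_doc = "background check " * 10
print(round(cosine_similarity(short_doc, long_doc), 6))
```

Scaling a vector does not change its angle, which is why the repeated document still scores as essentially identical to the short one.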

Fingerprint Matching

Uses content hashing to detect exact or near-exact duplicate content blocks. Highly sensitive to copy-paste scenarios. Even small nonzero scores can point to copied passages and warrant review.
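One common way to implement this kind of matching is to hash overlapping word windows (shingles) and compare the resulting hash sets. The sketch below takes that approach as an assumption; the shingle size, the SHA-1 hash, and the normalization by the smaller set are illustrative choices, not the tool's documented method.

```python
import hashlib
import re


def shingle_fingerprints(text: str, k: int = 5) -> set[str]:
    """Hash every run of k consecutive words; texts shorter than k words yield no hashes."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    shingles = (" ".join(words[i:i + k]) for i in range(len(words) - k + 1))
    return {hashlib.sha1(s.encode("utf-8")).hexdigest() for s in shingles}


def fingerprint_match(a: str, b: str, k: int = 5) -> float:
    """Share of the smaller document's fingerprints that also occur in the other document."""
    fa, fb = shingle_fingerprints(a, k), shingle_fingerprints(b, k)
    if not fa or not fb:
        return 0.0
    return len(fa & fb) / min(len(fa), len(fb))


original = "our reports include criminal records contact details and address history"
copied = "sign up today our reports include criminal records contact details and more"
paraphrased = "find arrest records phone numbers and past addresses in our reports"

print(fingerprint_match(original, copied))       # a pasted passage shares shingles
print(fingerprint_match(original, paraphrased))  # a rewording shares none
```

Under these assumptions a reworded passage produces no matching hashes at all, so a fingerprint score of 0% does not rule out paraphrased overlap.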

Semantic Analysis

Uses AI models to understand meaning and context beyond keywords. Detects paraphrased or rewritten content that maintains similar meaning. High scores indicate conceptual similarity.

Topic Modeling

Identifies underlying topics and themes across content. Groups content by subject matter similarity. Higher scores indicate sites covering similar topic areas or categories.

Content Similarity Analysis

Comparison                                              Overall  Jaccard  Cosine  Fingerprint  Semantic  Topic
int_20251027_1934_vs_tf_20251027_1935                   50%      41%      78%     0%           71%       55%
bv_20251027_1934_vs_tf_20251027_1935                    36%      19%      91%     0%           22%       45%
bv_20251027_1934_vs_int_20251027_1934                   35%      19%      84%     0%           16%       65%
bv_20251027_1934_vs_spokeo_20251027_1935                32%      15%      81%     0%           24%       30%
spokeo_20251027_1935_vs_tf_20251027_1935                32%      13%      79%     0%           20%       45%
int_20251027_1934_vs_spokeo_20251027_1935               26%      13%      66%     0%           17%       30%
icm_20251027_1934_vs_tf_20251027_1935                   14%      16%      13%     0%           26%       15%
icm_20251027_1934_vs_int_20251027_1934                  11%      11%      11%     0%           29%       0%
bv_20251027_1934_vs_icm_20251027_1934                   10%      12%      12%     0%           8%        20%
icm_20251027_1934_vs_spokeo_20251027_1935               8%       11%      11%     0%           6%        10%
bv_20251027_1934_vs_truepeoplesearch_20251027_1935      0%       0%       0%      0%           0%        0%
icm_20251027_1934_vs_truepeoplesearch_20251027_1935     0%       0%       0%      0%           0%        0%
int_20251027_1934_vs_truepeoplesearch_20251027_1935     0%       0%       0%      0%           0%        0%
spokeo_20251027_1935_vs_truepeoplesearch_20251027_1935  0%       0%       0%      0%           0%        0%
tf_20251027_1935_vs_truepeoplesearch_20251027_1935      0%       0%       0%      0%           0%        0%

Content Analysis Insights

🚨 Risk Assessment

🔴 -- Critical Similarity
🟠 -- High Concern
🟡 -- Moderate Risk
🟢 -- Acceptable Difference

📊 Algorithm Performance

Most Sensitive: Cosine
Least Sensitive: Fingerprint
Best Detector: Semantic

🔍 Content Patterns

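The highest-scoring pair is int_20251027_1934 vs tf_20251027_1935 (50% overall), which the report labels InstantCheckmate vs TruthFinder
TruePeopleSearch appears most unique: it scores 0% on every metric against all five other sites, which may reflect an empty or failed capture rather than distinct content
3 sites (bv, int, tf) form a similarity cluster, with pairwise overall scores of 35% to 50%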
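
A cluster like the one noted above could be found by linking every pair whose overall score reaches a cutoff and then taking connected components. The sketch below does this with the non-zero overall scores from this report, keyed by the site prefixes in the comparison IDs. The 35% cutoff and the function name are assumptions for illustration; the report does not say how the tool defines a cluster.

```python
# Overall similarity (%) for the non-zero pairs in this report.
# The five truepeoplesearch pairs all score 0% and are omitted.
overall = {
    ("int", "tf"): 50, ("bv", "tf"): 36, ("bv", "int"): 35,
    ("bv", "spokeo"): 32, ("spokeo", "tf"): 32, ("int", "spokeo"): 26,
    ("icm", "tf"): 14, ("icm", "int"): 11, ("bv", "icm"): 10,
    ("icm", "spokeo"): 8,
}


def similarity_clusters(scores, threshold):
    """Group sites into connected components of pairs scoring at or above the threshold."""
    graph = {}
    for (a, b), score in scores.items():
        graph.setdefault(a, set())
        graph.setdefault(b, set())
        if score >= threshold:
            graph[a].add(b)
            graph[b].add(a)

    seen, groups = set(), []
    for site in sorted(graph):
        if site in seen:
            continue
        stack, group = [site], set()
        while stack:  # depth-first walk over linked sites
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(graph[node] - group)
        seen |= group
        groups.append(sorted(group))
    return groups


# Assumed cutoff of 35%; keep only groups with more than one site.
multi_site = [g for g in similarity_clusters(overall, 35) if len(g) > 1]
print(multi_site)
```

The grouping is sensitive to the cutoff: lowering it to 30% would pull spokeo into the same group, since it scores 32% against both bv and tf.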

💡 Recommendations

CRITICAL Investigate high-similarity pairs for potential copying
HIGH Review semantic matches for paraphrased content
MEDIUM Monitor cosine scores for content theme overlap