HOUSE_OVERSIGHT_017013.jpg

2.3 MB

Extraction Summary

People: 1
Organizations: 3
Locations: 0
Events: 2
Relationships: 0
Quotes: 3

Document Information

Type: Technical report / academic paper supplement (methodology section)
File Size: 2.3 MB
Summary

This document appears to be a methodology appendix for a study on 'Historical N-grams Corpora' utilizing Google Books data. It describes the technical process of filtering metadata to ensure accuracy, specifically removing serial publications via an algorithm dubbed 'Serial Killer.' The document bears a 'HOUSE_OVERSIGHT_017013' Bates stamp, indicating it was part of a document production to the House Oversight Committee, though the text itself contains no direct references to Epstein, Maxwell, or specific criminal activities.
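As a rough illustration of the kind of metadata filtering the summary describes, the sketch below flags volumes whose metadata contains a phrase suggestive of a serial publication. It is not the study's actual 'Serial Killer' algorithm: the function names, the dictionary-based metadata layout, and every phrase other than 'US Government report' (which the document mentions as a filter phrase) are assumptions made for this example.

```python
# Minimal sketch of phrase-based serial filtering; NOT the study's implementation.
# Only "us government report" comes from the document; the other phrases
# are hypothetical placeholders chosen for illustration.
SERIAL_PHRASES = (
    "us government report",
    "annual report",   # hypothetical
    "proceedings of",  # hypothetical
)


def looks_like_serial(metadata: dict) -> bool:
    """Return True if any metadata value contains a suggestive phrase."""
    # Join all metadata values into one lowercase string for matching.
    text = " ".join(str(value) for value in metadata.values()).lower()
    return any(phrase in text for phrase in SERIAL_PHRASES)


def filter_serials(volumes: list) -> list:
    """Keep only the volumes that do not look like serial publications."""
    return [v for v in volumes if not looks_like_serial(v)]


if __name__ == "__main__":
    sample = [
        {"title": "Moby-Dick", "publisher": "Harper"},
        {"title": "US Government report on trade", "publisher": "GPO"},
    ]
    for volume in filter_serials(sample):
        print(volume["title"])
```

Simple substring matching like this will over-filter some monographs and miss some serials, which is presumably why the document pairs its filtering with a manual check of metadata accuracy on a sample of volumes.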

People (1)

Name: Annotator
Role: Researcher/Verifier
Context: An individual with no knowledge of the study who manually determined date-of-publication for 1000 volumes.

Organizations (3)

Name: Google
Context: Digitized the 15 million books from which the study's source material was drawn.

Name: US Government
Context: Mentioned in the context of 'US Government report' as a filter phrase.

Name: House Oversight Committee
Context: The document bears the Bates stamp 'HOUSE_OVERSIGHT'.

Timeline (2 events)

1550-2008: Analysis period for n-gram frequency tables. Global (English and foreign language corpora).
1801-2000: Metadata accuracy examination of 1000 filtered volumes.

Key Quotes (3)

Quote #1: "As noted in the paper text, we did not analyze the entire set of 15 million books digitized by Google."
Source: HOUSE_OVERSIGHT_017013.jpg

Quote #2: "Our 'Serial Killer' algorithm removed serial publications by looking for suggestive metadata entries"
Source: HOUSE_OVERSIGHT_017013.jpg

Quote #3: "For English books, 29.4% of books were filtered using the 'Serial Killer'"
Source: HOUSE_OVERSIGHT_017013.jpg
