HOUSE_OVERSIGHT_017013.jpg
2.3 MB
Extraction Summary
| Category | Count |
|---|---|
| People | 1 |
| Organizations | 3 |
| Locations | 0 |
| Events | 2 |
| Relationships | 0 |
| Quotes | 3 |
Document Information
Type: Technical report / academic paper supplement (methodology section)
File Size: 2.3 MB
Summary
This document appears to be a methodology appendix for a study on 'Historical N-grams Corpora' utilizing Google Books data. It describes the technical process of filtering metadata to ensure accuracy, specifically removing serial publications via an algorithm dubbed 'Serial Killer.' The document bears a 'HOUSE_OVERSIGHT_017013' Bates stamp, indicating it was part of a document production to the House Oversight Committee, though the text itself contains no direct references to Epstein, Maxwell, or specific criminal activities.
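The metadata filtering described above can be illustrated with a minimal sketch. It is not the study's actual 'Serial Killer' implementation, which this summary does not reproduce. It assumes each volume is a dictionary of metadata fields and that a serial is flagged when any field contains a suggestive phrase. Only 'US Government report' is attested in the source as a filter phrase; the other phrases, the record layout, and the function names are hypothetical.

```python
# Assumed phrase list: only "us government report" comes from the source;
# "journal" and "annual report" are illustrative placeholders.
SUGGESTIVE_PHRASES = ("us government report", "journal", "annual report")


def looks_like_serial(record, phrases=SUGGESTIVE_PHRASES):
    """Return True if any metadata value contains a suggestive phrase."""
    text = " ".join(str(value) for value in record.values()).lower()
    return any(phrase in text for phrase in phrases)


def filter_serials(records, phrases=SUGGESTIVE_PHRASES):
    """Drop records flagged as serials; also report the fraction removed."""
    kept = [r for r in records if not looks_like_serial(r, phrases)]
    removed_fraction = 1 - len(kept) / len(records) if records else 0.0
    return kept, removed_fraction


if __name__ == "__main__":
    # Hypothetical records, not data from the study.
    sample = [
        {"title": "US Government Report on Trade", "publisher": "GPO"},
        {"title": "Moby-Dick", "publisher": "Harper"},
    ]
    kept, fraction = filter_serials(sample)
    print([r["title"] for r in kept], fraction)
```

Plain substring matching is the simplest reading of "looking for suggestive metadata entries". A real filter would likely need field-specific rules to avoid false positives such as a novel whose title contains "journal".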
People (1)
| Name | Role | Context |
|---|---|---|
| Annotator | Researcher/Verifier | An individual with no knowledge of the study who manually determined date-of-publication for 1000 volumes. |
Organizations (3)
| Name | Type | Context |
|---|---|---|
| Google | | Digitized 15 million books used as the source for the study. |
| US Government | | Mentioned in the context of 'US Government report' as a filter phrase. |
| House Oversight Committee | | Document bears the Bates stamp 'HOUSE_OVERSIGHT'. |
Key Quotes (3)
Quote #1: "As noted in the paper text, we did not analyze the entire set of 15 million books digitized by Google."
Source: HOUSE_OVERSIGHT_017013.jpg
Quote #2: "Our 'Serial Killer' algorithm removed serial publications by looking for suggestive metadata entries"
Source: HOUSE_OVERSIGHT_017013.jpg
Quote #3: "For English books, 29.4% of books were filtered using the 'Serial Killer'"
Source: HOUSE_OVERSIGHT_017013.jpg