This document is a page of supplementary references for a scientific paper titled 'Quantitative analysis of culture using millions of digitized books' by Michel et al. (likely the 2011 'Culturomics' paper). It lists technical citations related to OCR technology, data processing (MapReduce), and information quantification. The document bears a 'HOUSE_OVERSIGHT' Bates stamp, indicating it was part of evidence collected during a congressional investigation, likely related to Jeffrey Epstein's connections to scientific research and funding.
| Name | Role | Context |
|---|---|---|
| L. Taycher | Author | Cited in reference S1 regarding Google Books |
| Ray Smith | Author | Cited in reference S2 regarding Tesseract OCR |
| Daria Antonova | Author | Cited in reference S2 |
| Dar-Shyang Lee | Author | Cited in reference S2 |
| Ashok Popat | Author | Cited in reference S3 regarding anomalous text detection |
| Thorsten Brants | Author | Cited in reference S4 |
| Alex Franz | Author | Cited in reference S4 |
| Jeffrey Dean | Author | Cited in reference S5 regarding MapReduce |
| Sanjay Ghemawat | Author | Cited in reference S5 regarding MapReduce |
| Peter Lyman | Author | Cited in reference S6 |
| Hal R. Varian | Author | Cited in reference S6 |
| Christian Bizer | Author | Cited in reference S9 regarding DBpedia |
| Jens Lehmann | Author | Cited in reference S9 |
| Georgi Kobilarov | Author | Cited in reference S9 |
| Sören Auer | Author | Cited in reference S9 |
| Christian Becker | Author | Cited in reference S9 |
| Richard Cyganiak | Author | Cited in reference S9 |
| Sebastian Hellmann | Author | Cited in reference S9 |
| Michel et al. | Primary Author | Author of the main paper 'Quantitative analysis of culture using millions of digitized books' for which this document... |

| Name | Type | Context |
|---|---|---|
| ACM | | Association for Computing Machinery, publisher of cited proceedings |
| LDC | | Linguistic Data Consortium, publisher of cited dataset |
| University of California, Berkeley | | Associated with reference S6 URL |
| Wikipedia | | Cited in references S7, S8, S10 |
| House Oversight Committee | | Source of the document (Footer: HOUSE_OVERSIGHT_017038) |

| Location | Context |
|---|---|
| | Location of the International Conference on Multilingual OCR mentioned in S2 |

"Quantitative analysis of culture using millions of digitized books"
"Books of the world stand up and be counted"
"Adapting the Tesseract open source OCR engine for multilingual OCR"
"MapReduce: Simplified Data Processing on Large Clusters"
"How Much Information"
Complete text extracted from the document (1,559 characters)