and J. S. Mill and later by behavioral psychologists, like Pavlov and B. F. Skinner. On
this view, the abstractness and hierarchical structure of representations are something of an
illusion, or at least an epiphenomenon. All the work can be done by association and
pattern detection—especially if there are enough data.
Over time, there has been a seesaw between this bottom-up approach to the
mystery of learning and Plato’s alternative, top-down one. Maybe we get abstract
knowledge from concrete data because we already know a lot, and especially because we
already have an array of basic abstract concepts, thanks to evolution. Like scientists, we
can use those concepts to formulate hypotheses about the world. Then, instead of trying
to extract patterns from the raw data, we can make predictions about what the data should
look like if those hypotheses are right. Along with Plato, such “rationalist” philosophers
and psychologists as Descartes and Noam Chomsky took this approach.
Here’s an everyday example that illustrates the difference between the two
methods: solving the spam plague. The data consist of a long unsorted list of messages in
your in-box. The reality is that some of these messages are genuine and some are spam.
How can you use the data to discriminate between them?
Consider the bottom-up technique first. You notice that the spam messages tend
to have particular features: a long list of addressees, origins in Nigeria, references to
million-dollar prizes or Viagra. The trouble is that perfectly useful messages might have
these features, too. If you looked at enough examples of spam and non-spam emails, you
might see not only that spam emails tend to have those features but that the features tend
to go together in particular ways (Nigeria plus a million dollars spells trouble). In fact,
there might be some subtle higher-level correlations that discriminate the spam messages
from the useful ones—a particular pattern of misspellings and IP addresses, say. If you
detect those patterns, you can filter out the spam.
The bottom-up machine-learning techniques do just this. The learner gets
millions of examples, each with some set of features and each labeled as spam (or some
other category) or not. The computer can extract the pattern of features that distinguishes
the two, even if it’s quite subtle.
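One way to picture that pattern-extraction step is a toy word-frequency classifier in the spirit of naive Bayes. Everything here is illustrative: the four-message corpus, the choice of words as features, and the add-one smoothing are assumptions for the sketch, not any real filter's implementation, and real systems train on millions of messages with far richer features.

```python
from collections import Counter

def train(messages):
    """Count how often each word appears in spam vs. non-spam ("ham")."""
    counts = {"spam": Counter(), "ham": Counter()}
    totals = {"spam": 0, "ham": 0}
    for text, label in messages:
        for word in text.lower().split():
            counts[label][word] += 1
            totals[label] += 1
    return counts, totals

def spam_score(text, counts, totals):
    """Multiply per-word likelihood ratios; add-one smoothing keeps
    words the filter has never seen from zeroing out the score."""
    score = 1.0
    for word in text.lower().split():
        p_spam = (counts["spam"][word] + 1) / (totals["spam"] + 2)
        p_ham = (counts["ham"][word] + 1) / (totals["ham"] + 2)
        score *= p_spam / p_ham
    return score  # > 1: the word pattern looks more like the spam pile

# A deliberately tiny, made-up training corpus.
corpus = [
    ("claim your million dollar prize", "spam"),
    ("nigeria prince needs your help", "spam"),
    ("meeting notes attached for review", "ham"),
    ("draft of the paper attached", "ham"),
]
counts, totals = train(corpus)
print(spam_score("million dollar prize inside", counts, totals) > 1)  # True
print(spam_score("revised paper attached", counts, totals) > 1)       # False
```

The point of the sketch is the one the text makes: nothing here knows what spam *is*; the classifier only tallies which surface features co-occur with which label.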
How about the top-down approach? I get an email from the editor of the Journal
of Clinical Biology. It refers to one of my papers and says that they would like to publish
an article by me. No Nigeria, no Viagra, no million dollars; the email doesn’t have any
of the features of spam. But by using what I already know, and thinking in an abstract
way about the process that produces spam, I can figure out that this email is suspicious.
(1) I know that spammers try to extract money from people by appealing to
human greed.
(2) I also know that legitimate “open access” journals have started covering their
costs by charging authors instead of subscribers, and that I don’t practice anything like
clinical biology.
Put all that together and I can produce a good new hypothesis about where that
email came from. It’s designed to sucker academics into paying to “publish” an article in
a fake journal. The email was a result of the same dubious process as the other spam
emails, even though it looked nothing like them. I can draw this conclusion from just one
example, and I can go on to test my hypothesis further, beyond anything in the email
itself, by googling the “editor.”
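The top-down inference can be caricatured as a single Bayesian update: strong prior knowledge about how emails get generated, combined with how likely this one email would be under each hypothesis. Every number below is invented purely for illustration; the point is only that confident conclusions can follow from one example when the priors do the work.

```python
# Hypothetical prior beliefs about where an unsolicited journal email comes from.
priors = {"legit_invitation": 0.90, "pay_to_publish_scam": 0.10}

# Hypothetical likelihood of "an out-of-field journal solicits my paper"
# under each hypothesis. A legitimate journal almost never does this;
# a pay-to-publish scam does it routinely.
likelihood = {"legit_invitation": 0.01, "pay_to_publish_scam": 0.60}

# Bayes' rule: posterior is proportional to prior times likelihood.
unnorm = {h: priors[h] * likelihood[h] for h in priors}
total = sum(unnorm.values())
posterior = {h: unnorm[h] / total for h in priors}

print(round(posterior["pay_to_publish_scam"], 2))  # 0.87
```

One email, and the scam hypothesis jumps from a 10% prior to roughly 87%, which is exactly the contrast with the bottom-up filter: no large labeled dataset was needed, only abstract knowledge about the process generating the data.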