The Questors

by Frederick Rustam

Part Two, THE DELIGHT OF BEING TOGETHER

If you haven't read the previous parts of this story, follow the links below:

Part One

_____________________________________________________________________

Questor Institute is a new, experimental technical school where

bright-but-poor high-school graduates on full scholarships spend

two years seeking to become wizards of Internet sorcery by studying

the science and philosophy of information retrieval from textual

databases such as the World Wide Web. In Part One, "A School for

Internet Sorcery," two students, Kevin and Marylou, have their

aptitudes tested, strike up a friendship, and attend the school's

first Assembly, where they're welcomed by the Rector in a speech

setting forth the unusual educational goals of Questor Institute.

_____________________________________________________________________

Co-occurrence, Correlation, Context

Kevin and Marylou had arrived early and were experimenting with their

high-speed workstations when the teacher walked in. He was a tweedy man

in his forties who wore a bowtie and parted his hair conspicuously

in the middle.

"I'll be giving you search examples from my personal experiences

in Internet searching, mostly from the Web," the teacher began.

"Use your computers to search my examples as I discuss them, if you

wish, but don't forget to pay attention to what I'm saying. Okay?

Some of the examples I discuss may seem trivial, especially by

comparison with the complex subjects that professional searchers

have to retrieve. But at this stage in your instruction, I prefer

to use curiosity-satisfying examples which are easier to understand

and which'll give us some enjoyment to pursue. You'll be searching

for hard-to-find subjects you've never heard of, soon enough.

"There're many search issues for us to deal with. Some will arise

as we proceed, but I'll be unable to pursue them right then, and

I'll say, 'We'll deal with that in detail, later.' If I stopped my

flow of instruction and sidetracked us into every new search issue

which reared its ugly head, I'd subvert your learning process.

"First, let's define some basic terms. A textword is a word

used in the text of a webpage or a Usenet posting. Textwords are

copied to create index words. A searchword is a word

we formulate in our minds to make a subject search, without knowing

for certain that it exists as a textword. Practically speaking then,

a searchword is a 'probable textword,' viewed from a searcher's

perspective. Because the Web and Usenet are such immense databases,

our searchwords will almost always be found as textwords somewhere

on the Internet.

"A term you'll often see on the Internet and in literature about the

Internet, and hear spoken by most people, is 'keyword.' This term is

overused---like the much-overused term, 'homepage.' As students

seeking to be Questors, you'll mostly avoid 'keyword,' except to

understand how others use it so that you can properly communicate

with them. A keyword is, broadly, a word which is a key to finding

information. It's our searchword of choice, it's the index word which

matches our searchword, and it's the textword we seek in a document.

People use 'keyword' to refer to all those things."

The teacher smirked in anticipation of the reactions

his students would have to his terminology.

"Three other terms I'll use are offered as jargon for us Questors.

'Gold' is a search-result item which is relevant to our search and

useful to our purpose.... 'Chicken feed' is a search result which

is technically relevant to our searchword but not useful for our

purposes. A mere mention of something we're seeking---that's the

usual textual form chicken feed takes.

"'Garbage' is a collective term for search-result items which

aren't at all relevant to our purposes, but which show up anyway.

"Let's begin our study of the AND operator with a nice sentiment:

'The Delight of Being Together.' This sentiment has served lovers

for countless generations of human existence. But it also serves

those of us who seek information from textual databases. When we put

together several searchwords, we hope to retrieve relevant text where

our words are found together in the same meaningful relationship they

were in our minds when we chose them for searching. The delight of

being together throughout the entire infotrieval process is not

easily experienced, though.

"There are three 'C's which we must understand: co-occurrence,

correlation, context. These are three fundamental realities

of retrieval using the AND operator, by far the most-often used

logical search operator. When we use several searchwords in most

search engines, our words are ANDed to each other by default---

that is, even if we don't actually type the word AND between them.

In this way, complex, more-specific search subjects are expressed

by using an increasing number of single words as building blocks,

just as natural language phrases are constructed from words.
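
For readers who'd like to see default ANDing at work, here is a minimal Python sketch. It is not how any real search engine is built: the three tiny "pages," the crude tokenizer, and the names build_index and search_and are all assumptions invented for illustration. Textwords are copied into an index, and a query of several searchwords retrieves only the pages on which every word co-occurs.

```python
import re
from collections import defaultdict

# Hypothetical miniature "Web": page id -> page text (invented for illustration).
pages = {
    "page1": "Charles Martel was the grandfather of Charlemagne",
    "page2": "Charlemagne was crowned emperor in 800",
    "page3": "Charles Martel fought at Tours",
}

def tokenize(text):
    """Split text into lowercase words (a deliberately crude assumption)."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(pages):
    """Copy every textword into an index: word -> set of page ids."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for word in tokenize(text):
            index[word].add(page_id)
    return index

def search_and(index, query):
    """AND the searchwords by default: keep only pages containing every one."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result

index = build_index(pages)
# No AND is typed between the words; it is implied.
print(sorted(search_and(index, "Martel Charlemagne")))
```

The sketch checks co-occurrence only. It knows nothing of how the matched words relate to one another on the page.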

"To illustrate these three realities, here's a retrieval situation

which sprang from one of my casual curiosities:

I heard a know-it-all radio talkshow host mention Charles

Martel, a medieval French leader, and he added as an aside,

'That's Charlemagne.' I thought he was wrong: Charles Martel

and Charlemagne (Charles the Great) weren't the same man.

How can I easily use the Web to prove my assumption?"

A student said, "Search either guy's name, and do a page-search for

the other name."

"Possible, but not quick enough. I could spend a lot of time checking

webpages about one man for a mention of the other. Let's construct a

logical word relationship before we search." He turned and wrote on

the easily-erasable whiteboard a search strategy:

"Use your computers now to make this exact search."

The students pounded on their keyboards. This is kid stuff,

thought Kevin. I know the point he's making, mused Marylou.

"When we search this way, what do we retrieve?... The results may

surprise you."

"Webpages with both names on them," offered a girl, who was reading

the search results as she spoke.

----------------------------------------------------------------------

Charles Martel - Wikipedia

... turned the tide of Islamic advance, and the unification of the

Frankish kingdom under Charles Martel, his son Pepin the Short,

and his grandson Charlemagne ...

www.wikipedia.org/wiki/Charles_Martel - 12k - Cached -

Similar pages