The Questors

by Frederick Rustam



Questor Institute is a new, experimental technical school where

bright-but-poor high-school graduates on full scholarships spend

two years seeking to become wizards of Internet sorcery by studying

the science and philosophy of information retrieval from textual

databases such as the World Wide Web. In Part Three, "The Nearness

of You," Kevin and Marylou studied the tricky process of subject

qualification, learned the greater effectiveness of proximity

searching with the NEAR logical operator, and began a search

for an inspiring mountain.



Information Availability


Kevin and Marylou's studies at Questor Institute were centered on the

technology of Internet textword searching. The Institute also taught

them the historical and philosophical foundations of this primary

medium from which they retrieved information. Most of their exposure

to the Institute's Internet philosophy occurred in the course,

History and Nature of Computerized Information Systems.


"Today, we're gonna get orientated 'bout the 'Net," joked Kevin.


"Oriented," corrected Marylou, not allowing for his humor. She pointed

to the course outline displayed on their workstation screens. "Today's

lecture is supposed to be a brief introduction to Internet history

and philosophy."


"The universal Internet is only a few years old," began the teacher,

who was the Institute's elderly Rector. "Yet it's already become

the most significant instrumentality for the storage and retrieval

of publicly-available information since the advent of the public

library. We all assume that technology drives and steers society

to some degree. Printing helped to change medieval society and bring

about the Renaissance by making more-diverse literature more-widely

available. Internet technology is changing today's global society by

making public information from a wide variety of sources and viewpoints

universally-available, and by making that information more-easily

retrievable.



"The Internet was not, as our media often say, devised by the Dept.

of Defense for military telecommunication after a nuclear strike on

our nation. The TCP/IP packet-switching standard---Transmission Control

Protocol/Internet Protocol---and the ARPAnet to make use of it were

devised so that universities doing defense research sponsored by the

Advanced Research Projects Agency could quickly share their technical

papers and computer programs.


"Later, other government agencies and commercial outfits built similar

national data transmission networks, and for a while it was uncertain

which technical networking standard would triumph over the others.

When the Europeans saw that TCP/IP might become the de facto

standard for an international network, they rushed to create their

own version and got that adopted by the International Standards

Organization---but too late. TCP/IP became the world's Internet

protocol because people wanted an international computer network

right now, and they chose to use the functioning ARPA

standard, instead of waiting for something better to develop.


"It wasn't until the National Science Foundation built a civilian

network on the ARPAnet standard that a national network for all our

citizens became possible. When the NSF relinquished operation of its

NSFnet to a three-level combination of long-distance telecom companies,

regional telecom companies, and local, commercial Internet service

providers, the national citizen's computer network we know as the

Internet was born. It's important for us to appreciate this fact: it

wasn't until the government transferred one of its taxpayer-financed,

civilian computer networks to free-enterprise operation that a true

Internet became available for all our citizens. Our current Internet

wasn't built by the Defense Department as a gift for the general

public, although that's what some civilian visionaries in ARPA

hoped their network would eventually become."


The Rector tapped his head as if to elicit thought from his audience.


"But what is our Internet, really?... It's much more than a bunch of

telephone lines and routers. It's a bunch of agreements, agreements

between millions of information holders to use the same methods of

formatting and transmitting information and to make that information

available to everyone who can access it. The earliest of the

info-sharing technical agreements were email and 'FTP.' File Transfer

Protocol allowed users anywhere on the network to transfer their

data files and computer programs to others by using the same

simple procedures.


"Then came 'telnet,' a protocol which allowed all Internet users

to retrieve data by remotely manipulating the databases of other

users. Such a database might be the computerized catalog of a big

university library, a place where bibliographic data about rare

books might be found. This general database-sharing protocol opened

a big door to information retrieval: the acquisition of information,

routinely and without formal request. We didn't have to ask somebody

for their information; we just connected to them on the network and

took what we wanted---and we did so because the holders of that info

wanted us to have it. It was this new, free-information mode which

established the spirit of today's Internet.


"To retrieve information by telnet, however, we have to know the

technical procedures by which individual info-holders operate their

databases. These operating procedures vary, often in slightly-but-

annoyingly-different ways; yet they have to be employed correctly,

or we can't retrieve the information."



Universal Compatibility


"Then a researcher had a terrific idea. A British physicist, Tim

Berners-Lee---while working at CERN, a European research facility---

came up with a superprotocol for information-sharing on the Internet.

He called it 'HTTP, HyperText Transfer Protocol.' It's an agreement

by all Internet users to format much of their Internet information in a

standard fashion and allow access to it by a commonly-employed means

of retrieval, transmission, and display. This protocol allowed the

establishment of the World Wide Web.


"HTTP also provided for site-to-site hyperlinks embedded within the

text of webpages. These links offer Web users an important way to

discover more information about a subject than they can find on the

pages of the website they're currently viewing. The importance of

hyperlinks can't be overstressed. Following them can be like following

guidemarks on the walls of a dimly-lighted tunnel system onward to the

liberating sunlight.


"The advantages of a single Web-document protocol have since been

undermined by the proliferation of webpages in formats requiring

specialized software to view them. But the HTML format does have

limitations, and these can now be sidestepped by those document

authors who prefer more-versatile formats for their works while

making these works publicly-available via standard Web browsers.


"In March 1995, a milestone was reached when Internet Web traffic

first exceeded FTP traffic. HTTP was a quantum jump for the Internet

and a giant leap for humankind---one far-greater than traveling to

the moon, I assure you. It was HTTP that changed the Internet from

a techno-tool for an elite to an information system for the masses.

When HTTP spawned the World Wide Web, and commercial firms offered

the common folk access to it, the whole world went online to post

information and to retrieve the information of others."


"Wow," scoffed Kevin, too loudly. Marylou elbowed him. "Shhhhh."


"The Web is, however, a relatively 'freeform' database, in that its

only organization---if we can term it that---is that it exists on

webpages. Its information isn't naturally classified or grouped like

books in a library or want ads in a newspaper, where similar items

are gathered together for retrieval convenience. Nor are the Web's

webpages placed into neatly-packaged information containers like

those of structured databases, where categorized tables of data

elements can be retrieved individually or in combination.... But

this doesn't mean that Web information is almost-irretrievable,

as some critics have recklessly claimed.


"Millions of the world's people are accessing the Web, and they're

delighted---but frustrated, as well---by that vast information system.

Consider why this is so. In a library, when you want information about

a person, you can consult a general encyclopedia, a biographical

reference work, or a specific biography---depending upon the depth of

info you require. When you want in-depth information about any specific

subject, you can go to a place on the shelves where books about that

subject are located. Use of the library's subject catalog is desirable

but not always necessary.


"But on the Internet---unless you can directly access or follow a

hyperlink to another webpage of known relevance---you face the

terabytes of a vast, distributed megacorpus of text, universal in

its subject coverage, in which your desired subject is completely

scattered among billions of webpages in no order, and which is

retrievable only by means of textword indexes which more-or-less

tell you where subjects can be found---if you can successfully

search those indexes.


"There are some specialized search engines available which index

only webpages within a narrow area of specialization. But most Web

users prefer to use general search engines. These engines index

many more pages, they're updated more frequently, and they usually

offer more search features. This preference for general engines,

however, virtually guarantees users more retrieval frustration.


"The daunting challenge which the World Wide Web and the Usenet offer

information seekers is the reason for the establishment of Questor

Institute, and it's why you're here. In meeting this challenge

successfully, you'll find that you've frequently retrieved useful info

from those of your fellow citizens often labeled as 'amateurs.'"



Citizen Scholars


"There are many Usenet newsgroups; and there are, in effect, several

Webs. There is the publicized and well-known Commercial Web, where

people are trying to make money. The Academic Web now educates many

more people than registered college students by making non-confidential

academic documents universally available. The Government Web is where

Big Brother interacts with the Internet---not to control it but to

merely use it as the rest of us do."


As he intoned this reality, the Rector rolled his eyes skyward and

clasped his hands in mock-prayer. "Thank you, Lord, for this blessed

decentralization of the People's Medium." The audience applauded.


"There's the Media Web, a means for our news/feature media to extend

their influence over us, even into a place where we like to think that

we've escaped their traditional, one-way communication. The Media Web

is a part of the Commercial Web, although it hemorrhages money at an

enormous rate. Nonetheless, the media strongly feel that they have to

have a Web presence. And that's good for information seekers because

media websites are a valuable source of timely information---but not

necessarily 'authoritative' information. Media error seems to be

amplified by the modalities of the Digital Age.


"But most of the Web---nobody can say for sure how much---is the

province of ordinary citizens. You've noticed this, I trust. The

Citizen Web is composed of millions of websites created by ordinary

people like you and me. After the government surrendered the data

transmission infrastructure of the Internet to commercial telecom

firms, citizens who were not 'empowered' by colleges, governments,

or corporations---or favored by traditional publishers---could

more-easily contribute their information to society's info-pool.

And they did. The first trickle of citizen participation quickly

became a flood which is quietly sweeping our old informational

hierarchies into the drainpipe of history."


The Rector assumed a conspiratorial stance. "But don't tell this

to the Information Elite. They still believe themselves to be the

sole creators and disseminators of authoritative information."


Kevin whispered to Marylou, "I don't even have a homepage with an

authoritative picture of my cat."


"You will when you get a cat," she purred. "They're so lovable

we just have to tell everybody about them."


As if he'd heard that conversation, the teacher continued, "What

do ordinary citizens have to offer the Web that isn't offered by

academia, government, and commerce?... The same thing, really---

information---but without the customary authority those societal

institutions claim for themselves. Citizens have put up on the Web

billions of pages of text, sound, and graphics in which they have

an interest. I'm not just talking about homely 'homepages,' even

though personal websites can be useful sources of information.


"Many citizen websters are hobbyists who pursue their hobbies online

as well as offline. Hobbyist websites are often quite elaborate, and

their pages speak with an authority which most media pages lack because

the info comes from the heart as well as the head. 'Interest websites,'

those which reflect the interests of their entrepreneurs, are also

info-valuable. Even 'fansites,' where citizen fans exhibit their

enthusiasm for celebrities, can be useful sources of information.


"I once heard a popular radio musicologist introduce a medieval

selection as what sounded to me like 'Lo Mar May.' I was curious

enough to access the musicologist's website to find this title in his

program playlist. Although most of his numbered programs were listed,

a few were not, and #109 was one of those missing! For months, again

and again, I accessed the page to see if Program #109 had been added.

It hadn't been. Finally, I decided to 'go general,' that is, to make

a general search of the Web for another source of this guy's program

playlists. I did, and I quickly found a citizen's fansite about the

musicologist which had them. Checking Program #109, I found that the

title the musicologist had spoken was the French phrase, 'l'Homme

Arme'---'a-r-m-e' with an acute accent on the e---'The Armed Man.'

If that citizen website hadn't existed, I might still be waiting to

find out how 'Lo Mar May' was spelled. Such redundancy on the Web

is a good thing for information seekers.


"Our media, both offline and online, rarely refer to the Citizen Web.

When they do, it's usually to denigrate some part of it. Media folks

seem to feel they have to label 'amateur' Web information as being

unreliable. For a few years, the media relentlessly reported the

Internet as unreliable and dangerous. They did this because they

feared the competition this new medium presented. Newspapers feared

losing readers, television feared losing viewers, and these media

tried hard to stigmatize the new People's Medium---a medium which

they barely understood, and which they still know poorly today."


The Rector's hard, calculated look of contempt changed into a

wicked grin of pleasure.


"But when the media rushed onto the Web to gain a presence there,

their scare stories about the Internet greatly diminished. When

editors and reporters realized the value of the People's Medium

for their purposes, they became part of the Web community---but,

thank goodness, not a controlling part. Then, they realized they

couldn't write or say bad things about the Internet which might

discourage the computer-owning, educated class from accessing

media websites. But the Media Elite still keeps its collective

nose elevated well above the Citizen Web, even as they utilize

its information freely and often without attribution.


"So why do citizens spend their money to create informational

websites with elaborate pages of content? Ultimately, I guess,

they do it for personal accomplishment. But many feel strongly

about their hobbies, interests, and causes, and they desire to

express themselves on the Web to further these. In the end, who

cares why they do it?... Strong motives can produce information

distortion, but let's be thankful that so many citizens have

put their information on the People's Medium. As professional

information retrievers, you'll make use of data from citizen

webpages because it's handy and relevant.


"Academia, government, and commerce have no monopoly on scholarship.

You don't have to go to college or have an intellectual vocation

to become a scholar. Every citizen can be an armchair scholar or

even a deadly-serious one. You'll see a good example of this when

you discover the 'Witchfinder General' webpage in your homework

assignment." The Rector smiled, benignly.


"If you publish 'original' information---that's what you write by

your own efforts, not what you copy from others---then in Questor

Institute's view, you're a citizen scholar. Citizens now have the

People's Medium to display their scholarship, but they could enjoy

some recognition of their works by the intellectual hierarchs of

our society. In that vein, we hope to establish at Questor a Website

Excellence award program to recognize the outstanding contributions

of the citizenry to society's online information pool. Yes, I know

that there are a multitude of questionable Web awards out there in

cyberspace, but as Questor Institute becomes better-known, its Web

awards will, I trust, be the most-highly valued by their recipients.


"Today's homework assignment is this, then: locate a citizen webpage

which provides a short-but-good biography of the 17th-century English

Puritan witch hunter who styled himself as 'The Witchfinder General'

---and which provides a complete list of his many victims and their

'disposal.' In your report about this site, answer these questions:

(1) Does the Englishman who put up this page seem to have

a college degree?

(2) How does his 'authority'---or seeming lack of it---

matter to you?


"I'd like to end today's session with an anecdote about scholarship.

In California, there's an academic who's a self-appointed Internet

'contrarian'---gadfly. In 1995, he wrote this in a national magazine:


What the Internet hucksters won't tell you is that the Internet

is an ocean of unedited data without any pretense of completeness.

Lacking editors, reviewers, or critics, the Internet has become

a wasteland of unfiltered data. You don't know what to ignore

and what's worth reading.

Logged onto the World Wide Web, I hunt for the date of the

Battle of Trafalgar. Hundreds of files show up, and it takes

15 minutes to unravel them---one's a biography written by an

eighth grader, the second is a computer game that doesn't work,

and the third is an image of a London monument. None answers

my question...


"Recently, I read that passage reprinted in a book on search engines.

So I logged onto the Web, and in 15 seconds---not minutes---

I retrieved a search-results page on which six of the eight items

clearly stated the date of the Battle of Trafalgar. I didn't have

to click on those items; I found the answer in their annotations.

How can it be that I was so quickly successful, and that academic

'expert' was not?... I don't know how he searched for the date of

the Battle of Trafalgar, but I searched like a Questor would."

The Rector wrote on the whiteboard:


<-------------------ASSUMED SENTENCE------------------>

"The battle of Trafalgar was fought on October 21, 1805."



SEARCH INPUT: <"trafalgar was fought on">


"As you already know from your retrieval class, searching for

natural-language text requires that we employ natural-language

searchwords. Unlike Dr. Contrarian, I visualized a relevant text,

subject-analyzed it, and searched appropriately. How many of you

Questors would have searched this way?" Suddenly, the classroom was

filled with waving hands, not all of them honestly flourished, but

all of them strongly motivated.
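
The whiteboard technique (imagine the sentence that would answer the question, then search for a distinctive fragment of it as an exact phrase) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three sample "pages" are invented text, and `phrase_search` and `keyword_search` are hypothetical helpers, not the query syntax of any real search engine.

```python
import re

# Invented sample "pages" standing in for indexed webpages (assumption).
PAGES = {
    "page-a": "The Battle of Trafalgar was fought on October 21, 1805, off Cape Trafalgar.",
    "page-b": "Trafalgar Square in London is home to Nelson's Column.",
    "page-c": "A computer game about Trafalgar that no longer works.",
}


def normalize(text):
    """Lowercase and collapse whitespace so matching ignores case and line breaks."""
    return re.sub(r"\s+", " ", text.lower()).strip()


def keyword_search(word, pages):
    """Return sorted names of pages containing the single word anywhere."""
    pattern = re.compile(r"\b" + re.escape(word.lower()) + r"\b")
    return sorted(name for name, text in pages.items()
                  if pattern.search(normalize(text)))


def phrase_search(phrase, pages):
    """Return sorted names of pages containing the exact phrase."""
    needle = normalize(phrase)
    return sorted(name for name, text in pages.items()
                  if needle in normalize(text))


# A bare subject word matches every page that merely mentions the subject.
keyword_hits = keyword_search("trafalgar", PAGES)

# A fragment of the assumed sentence matches only the page that states the fact.
phrase_hits = phrase_search("trafalgar was fought on", PAGES)

print("keyword:", keyword_hits)
print("phrase: ", phrase_hits)
```

In this toy corpus the bare keyword returns all three pages, while the phrase taken from the assumed sentence returns only the one page that actually states the date.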


"Of course, without examining the six search-return items I couldn't

establish an 'authority' for the date each offered. But I can assume

an 'inherent authority.' When six webpages agree on the date of a

historical event, this amounts to a practical authoritativeness for

that date. It seems unlikely that all six webpages would have gotten

it wrong, especially when there was no other date in the results.
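
The "inherent authority" argument amounts to a simple tally: extract the date each result states and see how many agree. A minimal Python sketch follows, assuming six invented result annotations and a hypothetical helper named `consensus_date`; agreement among sources is treated here as practical evidence, not as proof.

```python
import re
from collections import Counter

MONTHS = ("January|February|March|April|May|June|July|"
          "August|September|October|November|December")
# Matches dates written like "October 21, 1805".
DATE_RE = re.compile(r"\b(?:%s) \d{1,2}, \d{4}\b" % MONTHS)

# Invented search-result annotations (assumption), one per returned item.
SNIPPETS = [
    "The Battle of Trafalgar was fought on October 21, 1805.",
    "Trafalgar was fought on October 21, 1805, off the coast of Spain.",
    "Nelson died when Trafalgar was fought on October 21, 1805.",
    "Naval history: Trafalgar was fought on October 21, 1805.",
    "Off Cape Trafalgar was fought on October 21, 1805, a decisive battle.",
    "Our school report: Trafalgar was fought on October 21, 1805.",
]


def consensus_date(snippets):
    """Return (most common date, its votes, snippets with a date), or None."""
    dates = []
    for snippet in snippets:
        match = DATE_RE.search(snippet)
        if match:
            dates.append(match.group(0))
    if not dates:
        return None
    date, votes = Counter(dates).most_common(1)[0]
    return date, votes, len(dates)


result = consensus_date(SNIPPETS)
print(result)
```

If a seventh snippet offered a different date, the tally would show the disagreement, and that would be the signal to open the pages and check their sources.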


"Many who take it upon themselves to harshly criticize the Internet

simply don't know the medium as well as they should. As Questors,

you'll study the Internet until it becomes 'second nature' to you.

You'll learn more than just how to retrieve from it---you'll learn

to respect it. Without that respect, the Internet surrenders its

secrets reluctantly.... When you approach a dog with love in your

heart, isn't it less-likely to bite you?"



"That was a powerful message," whispered Kevin. "I'm inspired."


Ignoring his sophomoric cynicism, Marylou replied, "So am I.

And it makes me feel good to know that I'm studying to enter the

latter-day equivalent of a medieval craft guild. When our class

graduates and shows the world what the word 'Questor' now means,

students'll be thronging to get in here. And we'll be remembered

as the Institute's pioneer class."


"Yippee! Circle the wagons!"


"That's not what I meant, Kevin."


"Hey sister, don't get me wrong. I can dig the guild thing."




Copyright 2002 by Frederick Rustam. Frederick Rustam is a retired civil

servant. He formerly indexed technical reports for the Department of

Defense. He writes science fiction for Web ezines as a hobby. He

studies and enjoys the Internet as a hobby.