adequate model for its time and place. But after a while, each has been found to be incorrect and inadequate and has had to be replaced by another model that more accurately portrayed the real world and its behavior.

Copernicus presented us with a new point of view and laid the foundation for modern celestial mechanics. That view gave us the basis for understanding the formerly mysterious tracks of the sun and the planets through the heavens. A new basis for understanding is available in the area of information systems. It is achieved by a shift from a computer-centered to a database-centered point of view. This new understanding will lead to new solutions to our database problems and speed our conquest of the n-dimensional data structures which best model the complexities of the real world.
The earliest databases, initially implemented on punched cards with sequential file technology, were not significantly altered when they were moved, first from punched card to magnetic tape and then again to magnetic disk. About the only things that changed were the size of the files and the speed of processing them.

In sequential file technology, search techniques are well established. Start with the value of the primary data key of the record of interest, and pass each record in the file through core memory until the desired record, or one with a higher key, is found. (A primary data key is a field within a record which makes that record unique within the file.) Social security numbers, purchase order numbers, insurance policy numbers, and bank account numbers are all primary data keys. Almost without exception, they are synthetic attributes specifically designed and created for the purpose of uniqueness. Natural attributes, e.g. names of people and places, dates, times, and quantities, are not assuredly unique and thus cannot be used.
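The search just described can be sketched in a few lines. This is an illustrative reconstruction, not code from the lecture; the record layout (key, payload) and the sample keys are assumptions made for the example.

```python
# A sketch of sequential-file search by primary data key: records pass
# through memory in ascending key order until the desired record, or one
# with a higher key, is found.
def sequential_search(records, target_key):
    """records: iterable of (primary_key, payload) pairs in ascending key order."""
    for key, payload in records:
        if key == target_key:
            return payload        # the record of interest
        if key > target_key:
            break                 # a higher key appeared: the record does not exist
    return None

# Illustrative "file" of three records.
file = [(1001, "Abel"), (1004, "Baker"), (1009, "Charlie")]
```

Note that a miss is detected as soon as a higher key passes by, so an unsuccessful search need not read the whole file, though on average it still reads half of it.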
The availability of direct access storage devices laid the foundation for the Copernican-like change in viewpoint. The directions of "in" and "out" were reversed. Where the input notion of the sequential file world meant "into the computer from tape," the new input notion became "into the database." This revolution in thinking is changing the programmer from a stationary viewer of objects passing before him in core into a mobile navigator who is able to probe and traverse a database at will.
Direct access storage devices also opened up new ways of record retrieval by primary data key. The first was called randomizing, calculated addressing, or hashing. It involved processing the primary data key with a specialized algorithm, the output of which identified a preferred storage location for that record. If the record sought was not found in the preferred location, then an overflow algorithm was used to search the places where the record would alternatively have been stored, if it existed at all. Overflow occurs when the preferred location is full at the time a record is originally stored.
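The scheme above can be sketched as follows. The lecture names no particular hashing or overflow algorithm, so this sketch assumes key-modulo-table-size addressing and linear probing purely for illustration; the table size and keys are likewise invented.

```python
# Calculated addressing (hashing) with an overflow algorithm.
# Linear probing stands in for the unspecified overflow technique.
TABLE_SIZE = 7
table = [None] * TABLE_SIZE          # each slot holds one record, or None

def preferred_location(primary_key):
    # The "specialized algorithm" that maps a key to its preferred slot.
    return primary_key % TABLE_SIZE

def store(primary_key, payload):
    slot = preferred_location(primary_key)
    for probe in range(TABLE_SIZE):            # overflow: try alternate slots
        i = (slot + probe) % TABLE_SIZE
        if table[i] is None:
            table[i] = (primary_key, payload)
            return
    raise RuntimeError("file full")

def fetch(primary_key):
    slot = preferred_location(primary_key)
    for probe in range(TABLE_SIZE):            # retrieval retraces the overflow path
        i = (slot + probe) % TABLE_SIZE
        if table[i] is None:
            return None                        # an empty slot ends the search
        if table[i][0] == primary_key:
            return table[i][1]
    return None
```

Retrieval must retrace exactly the path that storage would have taken, which is why an empty slot on that path proves the record was never stored.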
Copernicus completely reoriented our view of astronomical phenomena when he suggested that the earth revolves about the sun. There is a growing feeling that data processing people would benefit if they were to accept a radically new point of view, one that would liberate the application programmer's thinking from the centralism of core storage and allow him the freedom to act as a navigator within a database. To do this, he must first learn the various navigational skills; then he must learn the "rules of the road" to avoid conflict with other programmers as they jointly navigate the database information space.

This reorientation will cause as much anguish among programmers as the heliocentric theory did among ancient astronomers and theologians.

Key Words and Phrases: access method, attributes, calculated addressing, celestial mechanics, clustering, contamination, database, database key, database set, deadlock, deadly embrace, entity, hash addressing, overflow, owner, member, primary data key, Ptolemy, relationship, retrieval, secondary data key, sequential file, set, shared access, update, Weyerhaeuser

CR Categories: 3.74, 4.33, 4.34, 5.6, 8.1

As an alternative to the randomizing technique, the index sequential access technique was developed. It also used the primary data key to control the storage and retrieval of records, and did so through the use of multilevel indices.
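A single index level conveys the idea. In this sketch (an illustration, not a description of any particular index sequential implementation), an index of (highest key in block, block) entries narrows the search to one data block, which is then scanned sequentially; deeper structures simply repeat the narrowing step.

```python
# Index sequential retrieval: consult an index to pick one data block,
# then scan only that block. Blocks and keys are invented for the example.
data_blocks = [
    [(1001, "Abel"), (1004, "Baker")],
    [(1009, "Charlie"), (1015, "Dog")],
    [(1021, "Easy"), (1030, "Fox")],
]
# One index entry per block: the highest key the block contains.
index = [(block[-1][0], block) for block in data_blocks]

def isam_fetch(target_key):
    for highest_key, block in index:      # narrow the search via the index
        if target_key <= highest_key:
            for key, payload in block:    # sequential scan within one block
                if key == target_key:
                    return payload
            return None                   # the block that would hold it doesn't
    return None                           # key is beyond the last block
```

With b blocks of r records, a lookup touches roughly b index entries and r records instead of b times r, and each added index level divides the work again.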
The programmer who has advanced from sequential file processing to either index sequential or randomized access processing has greatly reduced his access time, because he can now probe for a record without sequentially passing all the intervening records in the file. However, he is still in a one-dimensional world, as he is dealing with only one primary data key, which is his sole means of controlling access.
From this point, I want to begin the programmer's training as a full-fledged navigator in an n-dimensional data space. However, before I can successfully describe this process, I want to review what "database management" is.

It involves all aspects of storing, retrieving, modifying, and deleting data in the files on personnel and production, airline reservations, or laboratory experiments -- data which is used repeatedly and updated as new information becomes available. These files are mapped through some storage structure onto magnetic tapes or disk packs and the drives that support them.
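The four activities named above can be made concrete against a simple keyed store. This is only a toy illustration: a Python dict stands in for a file, and the key and field names are invented.

```python
# Storing, retrieving, modifying, and deleting data by primary key,
# with a dict standing in for a personnel file.
personnel = {}

personnel[70412] = {"name": "Jones", "dept": "A"}   # store a new record
record = personnel[70412]                           # retrieve it by key
personnel[70412]["dept"] = "B"                      # modify a field in place
del personnel[70412]                                # delete the record
```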
Database management has two main functions. First is the inquiry or retrieval activity that reaccesses previously stored data in order to determine the recorded status of some real world entity or relationship. This data has previously been stored by some other job, seconds, minutes, hours, or even days earlier, and has been held in trust by the database management system. A database management system has a continuing re-
Communications of the ACM, November 1973, Volume 16, Number 11