SEARCHING THE LITERATURE
If you are wondering what there can be to learn about the scientific literature, have a look at this story about The Lobster and the Microchip.
The objectives of this tutorial about the literature of the sciences are:
- to provide a basic understanding of the structure, organization
and problems associated with scientific literature and its dissemination
to the scientific community;
- to introduce some of the sources of information in science and technology,
with particular reference to the major primary, secondary and tertiary
sources in the physical and life sciences, and
- to provide a framework for the exploration of the Internet as an
emerging medium for scientific and technical publishing.
The original detailed reports of scientific and technical research comprise the major part of the primary literature. Some of these reports may be largely observational or descriptive, but the majority are accounts of experimental work with results and conclusions. The primary literature represents new scientific knowledge and hence the latest available information. By its very nature the primary literature is widely scattered, disconnected, and unorganized. Its tremendous volume and diversity make it difficult to locate and apply, and as a consequence a second, more accessible type of information source has been developed: the secondary literature.
The secondary literature sources are compiled from the primary literature which they repackage in a more condensed and concise form, generally on the basis of subject content. In their most basic format they are simply descriptive lists of primary documents relevant to a particular subject field or topic. As such the secondary literature sources contain no new information and their role is to make the primary literature more accessible to the scientific community.
The tertiary literature is a less well-defined group of publications comprising literature guides and search aids, which are designed to help the scientist identify the primary and secondary literature sources in his/her particular research field.
The various forms in which scientific literature is published are common to all scientific disciplines, both pure and applied, and with minor modifications the pattern is also valid for most of the social science disciplines.
Primary Literature
The subject coverage of these journals varies considerably, ranging from multidisciplinary science journals, e.g., Nature, Science, Scientific American, which include limited numbers of articles from a wide range of scientific disciplines, to the highly specialized research journals, e.g., Carbohydrate Research, Journal of Low Temperature Physics, which are completely devoted to a specific research field.
Speed of publication
The main problem with journals as a medium for communicating the results
of research is the speed with which articles are published. It is not uncommon
for delays of six months or more to occur between the submission of a paper
to a journal and its publication. Part of this delay is caused at the editorial
checkpoint, for it is common practice with learned journals published in
English-speaking countries to send submitted papers to one or more independent
referees for an authoritative opinion on originality, technical accuracy
and significance. This procedure, called peer review, helps to maintain
standards by eliminating the frequent worthless and occasional fraudulent
paper, but creates problems for researchers. Delay in publication is particularly
frustrating in science and technology, where the pace of advance is so
rapid, and much research effort is duplicated, at great expense, because
of a lack of awareness of completed work.
Several commercial publishers have tried to reduce this delay by producing a new type of periodical called a 'Letters Journal'. These journals specialize in the rapid publication of short, unedited and unrefereed articles. The journals are generally offshoots of conventional journals and most claim to publish articles within 2-3 weeks of receipt.
Another approach, which has been adopted by several journals, is to include a section devoted to previews of forthcoming papers. It is becoming increasingly common for journal publishers to offer at least the titles of forthcoming publications via the Internet. Scientists who urgently require details of a particular research project can then write directly to the author rather than wait several months for the article to be published. The most recent developments in the quest for speed of publication involve the use of electronic mail and fax for rapid communication between authors, publishers and reviewers, and the increasing use of the Internet as a publishing medium. Here, an essentially new class of published scientific communications called "preprints" has emerged, particularly in the field of physics, and is challenging the traditional methods of publishing scientific articles.
An obvious extension of this is electronic publication via the Internet. There is, at present, a range of uses of these methods for the dissemination of periodicals. It is not yet clear how this type of publication will affect the future of conventional scientific journals.
The thesis is the culmination of at least two (M.Sc.) or three (Ph.D.) years of research work and contains a descriptive account of the procedures and instrumentation used, the results obtained and the conclusions drawn from the work. Theses vary considerably in length, ranging from 100 up to as many as 1000 pages, and in addition to the account of the research undertaken, usually contain an extensive review of the literature associated with the particular field of study. It is not uncommon to find up to 500 references cited in the bibliography of a thesis. This latter feature is of invaluable assistance to researchers new to the research field, as it saves them many hours of searching for relevant references.
It has been estimated that only 50% of the information and data contained in theses is subsequently published in the scientific periodical literature. Consequently, theses and dissertations are the only source of information for almost half of the higher degree research work undertaken in the world's universities.
The major problem associated with theses is availability, as many universities are unwilling to lend copies of their theses for fear of plagiarism and the possible loss of a unique document. Several countries, including the United States and Canada, have established national microfilming programmes in which microfilm copies of all the theses submitted to the universities in that country are produced. The microform copies are then made available for either loan or purchase. A national microfilm programme has not been established in Australia, but the loan of Australian theses can generally be negotiated via the university libraries network.
Recently, the Australian Digital Theses Project has begun, which aims to provide a national collaborative distributed database of digitised Australian PhD and Masters by Research theses.
Some information about finding theses in the James Cook University Library is available on our theses page.
PATENTS
A great deal of confusion exists amongst scientists about patents and
patent literature. A patent is basically a legal agreement between the
state (national government) and an individual inventor, in which the state
grants the inventor a monopoly for a certain period of time, generally
twenty years, during which time the inventor has the sole rights to commercially
exploit the invention. In return for this legal protection the inventor
discloses to the public the details of the invention in the form of a patent
specification. Consequently it is the patent specification which is the
source of published information and not the patent.
Patent specifications are required to conform to a standard format (Typical Patent Specification). The first section of the specification includes the basic information necessary to identify the patent: title, name of patentee (inventor), which is generally a commercial organization, patent number incorporating a country code, and the dates of application and acceptance of the patent specification. This is the information that is required to order a copy of a specification from one of the national patent offices. The second section contains the detailed description of the invention. This section varies considerably in length, ranging from the average of about five pages up to several hundred pages. However, under patent law there is a legal minimum amount of information which the patentee is obliged to provide and that is specified as 'sufficient information to enable a specialist to reproduce the invention'. In the third section, the Claims Section, the patentee outlines the scope of the invention and the claims for its application. From a legal viewpoint this is the most important part of the specification, as all the court cases over patent rights are centred on the claims.
An important point to remember is that individual patent specifications are only valid in the country granting the patent. If an inventor wants to protect his or her rights to an invention on a world-wide basis, then corresponding patent specifications must be filed in every country which recognizes the International Patent Convention (currently 120 countries). This is a very expensive procedure. After three years, annual renewal fees have to be paid to keep a patent in force and these generally increase in value with the age of the patent. If a patent is allowed to lapse then the invention becomes fair game for unrestricted commercialisation.
The length to which companies will go to protect their rights can be instanced by the Polaroid camera, basically a single invention but protected by 1,100 patents.
It is currently estimated that around a million patent specifications are granted each year, with the United States alone granting around ninety thousand. Patents are probably the most neglected and yet one of the most valuable sources of scientific and technical information. They are the main form in which the results of industrial research, i.e., research undertaken by commercial organizations such as BHP, ICI, Shell etc., are published. Commercial organizations, unlike universities, undertake research and development work with the prime objective of producing some form of marketable product, e.g., a new pharmaceutical compound, a new range of insecticides, an improved solar cell etc. If and when this objective is achieved the organizations obviously want to protect their invention from exploitation by their commercial competitors, and this protection is acquired by filing patents on the invention.
The relevance and value of patent literature varies between the different scientific disciplines but is generally greater in the physical and applied sciences, e.g., chemistry, physics, metallurgy, engineering etc., than in the life sciences, although the expansion of biotechnology and gene technology is beginning to change this. Recent court cases have focussed attention on the issue of patent rights in therapeutic drugs.
Searching for patent information
Searching for patents on a particular topic is facilitated by the use
of an International Patent Classification administered by WIPO, the World
Intellectual Property Organization. A sample of this classification can
be seen at their web site. The World Intellectual Property Organization
is a United Nations Agency "responsible for the promotion of the protection
of intellectual property throughout the world through cooperation among
States, and for the administration of various multilateral treaties dealing with the legal and administrative
aspects of intellectual property." WIPO administers a range of Unions or
treaties in the field of industrial property.
Full access to databases of patent application information is usually provided for a fee by major database vendors. Libraries, private companies and patent attorneys will also provide this service for a fee. A variety of patents and patent-related information is available via the Internet. For example, the European Patent Office (EPO) maintains a comprehensive list of patent-related web sites as a user service.
Reports of Industrial Research
These are progress or final reports of research work carried out by
commercial organizations. This type of report is usually prepared to inform
the management of these organizations about the stage of development of
a particular research project. Consequently they are generally confidential
and are not available to the general public. Some of the information contained
in these reports is eventually incorporated into the Patent literature.
Reports of Academic Research
It has become a trend over the last ten years for university departments
to produce their own research report series. They are generally produced
to publicize the research activities of the department, presumably in an
attempt to enhance its reputation and to attract research funds.
Another type of report produced by academic institutions is that produced as part of a consultancy or commercial enterprise. There may be a conflict of interest where an academic wishes to distribute information resulting from research work but is constrained from doing so by commercial confidentiality.
Reports of Government-financed Research
This is the main category of technical report and constitutes approximately
ninety percent of the total report literature. It includes reports of research
undertaken by government departments and organizations, e.g. CSIRO, and
research undertaken by non-government organizations, e.g., universities,
private laboratories, commercial organizations, which is financed by the
government. This type of report is generally identified by a unique report
number, e.g., AD250778, PB213078, EPA 7133.
The reports are also published in series and the alphabetic characters in a report number identify the series to which it belongs, i.e., AD (reports of the U.S. Department of Defense), EPA (U.S. Environmental Protection Agency), PB (National Technical Information Service). Approximately twenty thousand alphabetic report series codes are currently in existence. The report number is a very important feature of a report as it is a unique identifier and it should always be quoted if known.
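As a rough illustration of how the alphabetic series code can be separated from the rest of a report number, the following Python sketch splits the three example numbers quoted above. The function name and the three-entry lookup table are assumptions made for illustration only; real report numbers come in many more formats than this simple pattern handles.

```python
import re

# Series codes mentioned in the text; a real table would hold thousands of entries.
SERIES = {
    "AD": "U.S. Department of Defense",
    "EPA": "U.S. Environmental Protection Agency",
    "PB": "National Technical Information Service",
}

def split_report_number(report_number):
    """Split a report number into (series code, remainder).

    Hypothetical helper: assumes the code is the leading run of letters
    and that the remainder starts with a digit.
    """
    match = re.match(r"\s*([A-Za-z]+)[\s-]*(\d\S*)\s*$", report_number)
    if match is None:
        raise ValueError(f"unrecognised report number: {report_number!r}")
    return match.group(1).upper(), match.group(2)

for number in ["AD250778", "PB213078", "EPA 7133"]:
    code, rest = split_report_number(number)
    print(code, rest, SERIES.get(code, "unknown series"))
```

Quoting the full number, code included, is what allows a library or document supplier to make this kind of lookup.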
The proceedings of these conferences are generally published in one of two formats: either in book form or as a special issue or supplement to a periodical. More recently, the proceedings of some conferences have been published on CD-ROM, while other "electronic" conferences occur on the Internet using all the facilities of the World Wide Web. In the case of large international conferences the reporting of the proceedings is often limited to summaries or abstracts of the papers presented. Unfortunately, the proceedings of many conferences are never published in any form and their content is consequently lost to the scientific community.
Such exponential growth, with a doubling time variously estimated at between eight and twenty years, cannot continue indefinitely. In the developed countries we are probably approaching a limit where the proportion of scientists in the population is reaching saturation. In the underdeveloped countries, particularly China, there is no such saturation and indeed we may well see from this source an escalation in the growth of science which will more than compensate for any reduction in its growth rate in the developed world.
To put these figures into a more practical perspective, a new graduate chemist will have x units of information available at the start of his or her career, but on retirement 45 years later the total amount of available information will have expanded to 8x units (three doublings, which assumes a doubling time of 15 years).
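The arithmetic behind the 8x figure can be checked with a few lines of Python. This is only a sketch: it assumes steady exponential growth, and the function names are invented for the example.

```python
import math

def growth_factor(years, doubling_time):
    """Factor by which the literature grows over `years` under steady exponential growth."""
    return 2 ** (years / doubling_time)

def implied_doubling_time(years, factor):
    """Doubling time implied by growing by `factor` over `years`."""
    return years * math.log(2) / math.log(factor)

career = 45  # length of a career in years, as in the example above

# A 15-year doubling time gives three doublings in 45 years, i.e. a factor of 8.
print(growth_factor(career, 15))

# The two ends of the quoted range of estimates (8 and 20 years).
for doubling_time in (8, 20):
    print(doubling_time, round(growth_factor(career, doubling_time), 1))

# Working backwards from the 8x figure to the doubling time it assumes.
print(round(implied_doubling_time(career, 8), 1))
```

The shorter the doubling time, the more dramatic the expansion over a single career.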
It is obvious that it would be totally impractical to attempt to locate relevant information by searching through the primary literature. Some form of guide to the content of the primary literature is required and this is provided by the secondary literature sources. There are several different types of secondary source but they all have a common purpose and objective, and that is to collect together in a single publication, details of a large number of primary documents relevant to a particular subject field or topic, e.g., chemistry, genetics.
The second part of the entry is an abstract or summary of the content of the article. The amount of detail given in the abstract varies between abstracting journals but they are generally merely descriptive with no attempt at any evaluation. The abstract is very useful in assessing the relevance of particular articles; indeed, in certain instances the abstract itself may provide the required information, e.g., a reaction constant, spectroscopic data, a physical constant etc.
It is common practice to abbreviate journal titles in abstracting journals. Unfortunately there is little standardization in the abbreviations, and variant forms of abbreviation appear in different abstracting journals for the same primary journals. Each abstracting journal publishes a list of the abbreviations used with their full title equivalents and this is usually incorporated as a section in the annual index or is issued as a separate publication, e.g., The BIOSIS list of Serials and the Chemical Abstracts Service Source Index, which list the abbreviations used in Biological Abstracts and Chemical Abstracts respectively.
The majority of abstracting journals present the abstracts numbered serially and arranged into broad subject sections. The number of abstracts included in the various abstracting journals varies tremendously. In the more specialized journals, e.g., Electron Microscopy Abstracts, Nuclear Magnetic Resonance Spectrometry Abstracts, there may be only 2000-3000 abstracts per year, but in the journals with a broader subject coverage, e.g., Chemical Abstracts, Biological Abstracts, there can be as many as 400,000 per year.
Most of the abstracting journals provide a range of indexes to the abstracts. Most provide some form of author and subject index but some also include very specialized indexes, e.g., indexes of taxonomic species, patent numbers, molecular formulae of chemical compounds, report numbers etc. In all forms of index the index headings generally refer back to the relevant abstract numbers.
Most abstracting journals are published either weekly, fortnightly or monthly and each issue normally has its own index (or indexes). The individual issue indexes are cumulated either annually or semi-annually to make searching for relevant abstracts easier. A number of abstracting journals, e.g., Chemical Abstracts, Physics Abstracts, also compile collective indexes for longer time periods, generally five years. The cumulative indexes make searching very much quicker since you only have to look up an index term once in an annual index to find all of the relevant abstract numbers, rather than the 26 times necessary if you used the fortnightly issue indexes.
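Cumulation amounts to merging the separate issue indexes into one. The short Python sketch below illustrates the idea; the headings and abstract numbers are invented for the example, and the function name is hypothetical.

```python
from collections import defaultdict

def cumulate(issue_indexes):
    """Merge per-issue indexes (heading -> abstract numbers) into one cumulative index."""
    cumulative = defaultdict(list)
    for issue in issue_indexes:
        for heading, abstract_numbers in issue.items():
            cumulative[heading].extend(abstract_numbers)
    # Sort headings alphabetically and abstract numbers numerically.
    return {heading: sorted(numbers) for heading, numbers in sorted(cumulative.items())}

# Two invented fortnightly issue indexes.
issue_1 = {"PERMITTIVITY": [12], "QUARTZ": [30]}
issue_2 = {"PERMITTIVITY": [250]}

annual = cumulate([issue_1, issue_2])
print(annual["PERMITTIVITY"])  # one lookup instead of one per issue
```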
TYPES OF SUBJECT INDEXES
The major problem encountered in the production of a subject index
is the problem of synonymous terms, i.e., the use of a number of different
terms or phrases to represent the same concept. For instance, the terms
PERMITTIVITY, CAPACITANCE, DIELECTRIC CONSTANT, INDUCTIVITY CAPACITY, PERMEABILITY
(ELECTRIC) are all terms which can be used to describe PERMITTIVITY. Which
of these terms should be used to index articles relating to PERMITTIVITY?
Subject indexes can generally be categorized as either vocabulary controlled or free language indexes on the basis of their treatment of synonymous
terminology.
VOCABULARY CONTROLLED INDEXES
In a vocabulary controlled index the subject headings used are selected
from a standard word authority file (often referred to as a Thesaurus).
The problem of synonymous terms is overcome by selecting one of the terms
as the preferred term, i.e., the term to be used in the subject index (e.g.,
PERMITTIVITY) and cross-referencing all the other terms to the preferred
term in the index (e.g., DIELECTRIC CONSTANT see PERMITTIVITY). All articles
relating to dielectric phenomena will then be indexed under the subject
heading PERMITTIVITY.
The advantage of this type of index is that you only have to look under the one heading to find all the relevant information, with consequent savings in time and effort. The major disadvantage of vocabulary controlled indexes is the problem of finding the correct subject headings to define your interest. Many of the abstracting journals, e.g., Physics Abstracts, Chemical Abstracts, have published their thesauri as separate documents and it is essential that you consult them to determine the relevant preferred terms before using the subject indexes. The thesauri may also help you locate related terms of interest which you had not previously thought of.
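The mechanics of a vocabulary controlled index can be sketched in a few lines of Python. The cross-references use terms from the example above, but the abstract numbers are invented and the two-entry table stands in for a real thesaurus, which would be far larger.

```python
from collections import defaultdict

# "See" references: non-preferred term -> preferred term (a tiny, illustrative thesaurus).
SEE_REFERENCES = {
    "DIELECTRIC CONSTANT": "PERMITTIVITY",
    "CAPACITANCE": "PERMITTIVITY",
}

def preferred_term(term):
    """Return the preferred heading for a term, following any 'see' reference."""
    key = term.strip().upper()
    return SEE_REFERENCES.get(key, key)

def build_index(entries):
    """Build heading -> abstract numbers from (abstract number, term) pairs."""
    index = defaultdict(list)
    for abstract_number, term in entries:
        index[preferred_term(term)].append(abstract_number)
    return dict(index)

# Invented abstracts, each described by its author with a different term.
entries = [(101, "dielectric constant"), (102, "Permittivity"), (103, "capacitance")]
print(build_index(entries))
```

All three abstracts end up under the single heading PERMITTIVITY, which is why the searcher needs to know the preferred term before consulting the index.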
FREE-LANGUAGE INDEXES
In a free-language index no attempt is made to overcome the problem
of synonymous terms; consequently it is necessary to look up all the possible
terms in the index if relevant information is not to be missed. In this
respect free-language indexes are less efficient and more time consuming
to use than vocabulary controlled indexes.
Free-language indexes are generally computer-generated indexes which use the keywords and terms in the titles of the documents to be indexed as subject headings. These computer-produced keyword indexes are generally of two types: KWIC (KeyWord In Context) or KWOC (KeyWord Out of Context).
In a KWIC index, each significant term in the title of the article to be indexed is used as an index heading. The words are arranged in an alphabetical sequence in the central column with the remainder of the title arranged in its natural order around the index term. The number of entries in the index for any article is therefore dependent upon the number of significant terms which appear in the title of the article. An important feature to notice with KWIC indexes is that the titles are generally truncated because the index line has a prescribed maximum length. This feature can be a very limiting factor (as you will discover if you use Biological Abstracts) because it is often difficult to assess the relevance of certain documents from the incomplete titles in the index.
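A minimal Python sketch of how such an index might be generated is given below. The stop-word list, the column widths and the sample titles are all assumptions made for illustration; published KWIC indexes differ in layout (some wrap the title around the line rather than simply cutting it off).

```python
# Words too common to be useful as index headings (illustrative list only).
STOP_WORDS = {"a", "an", "and", "by", "for", "from", "in", "of", "on", "the", "to", "with"}

def kwic_lines(title, ref, left_width=25, right_width=25):
    """Return (keyword, index line) pairs, one per significant word in the title."""
    words = title.split()
    lines = []
    for i, word in enumerate(words):
        keyword = word.strip(".,;:()").lower()
        if not keyword or keyword in STOP_WORDS:
            continue
        # Text before the keyword is cut on the left, text from the keyword
        # onwards is cut on the right, so the keyword starts in a fixed column.
        left = " ".join(words[:i])[-left_width:]
        right = " ".join(words[i:])[:right_width]
        lines.append((keyword, f"{left:>{left_width}} {right:<{right_width}} {ref}"))
    return lines

def kwic_index(titles):
    """Build the full index from (reference number, title) pairs, sorted by keyword."""
    lines = []
    for ref, title in titles:
        lines.extend(kwic_lines(title, ref))
    return [line for _, line in sorted(lines)]

# Invented titles and reference numbers.
sample = [
    (1, "Electromagnetic emission from rock fractures"),
    (2, "Radio waves observed before earthquakes"),
]
for line in kwic_index(sample):
    print(line)
```

Because each line has a fixed width, long titles lose words at one or both ends, which is the truncation problem described above.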
KWOC indexes operate on a similar principle to the KWIC index except that the keywords are listed separately with the titles of articles containing the keyword listed below. Truncation of titles is very rare in KWOC indexes so they are easier to use in this respect. Examples of KWOC indexes can be found in Dissertation Abstracts.
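By way of comparison, a KWOC index can be sketched as follows. Again the stop-word list and the sample titles are invented for the example, and the function name is hypothetical.

```python
from collections import defaultdict

# Words too common to be useful as index headings (illustrative list only).
STOP_WORDS = {"a", "an", "and", "by", "for", "from", "in", "of", "on", "the", "to", "with"}

def kwoc_index(titles):
    """Map each significant title word to the full titles that contain it."""
    index = defaultdict(list)
    for title in titles:
        seen = set()
        for word in title.split():
            keyword = word.strip(".,;:()").upper()
            if not keyword or keyword.lower() in STOP_WORDS or keyword in seen:
                continue
            seen.add(keyword)
            index[keyword].append(title)  # the complete title, never truncated
    return dict(sorted(index.items()))

# Invented titles.
titles = ["Radio waves from earthquakes", "Earthquakes and rock fractures"]
for keyword, matches in kwoc_index(titles).items():
    print(keyword)
    for title in matches:
        print("    " + title)
```

Each keyword appears once as a heading with the complete titles listed beneath it.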
It should be stressed that there is no standardization of terminology in these free-language indexes and it is essential to look up all the synonymous terms if relevant documents are not to be missed.
It is estimated that over two thousand abstracting and indexing journals covering science and technology are currently published. The diversity of subject coverage is immense with some covering broad subject fields, e.g., Chemical Abstracts, Physics Abstracts, whilst others cover narrow specializations, e.g., Air Pollution Abstracts, Current Advances in Genetics. There is considerable overlap in the subject coverage of these journals and it is very common to find an entry for a particular periodical article appearing in several abstracting or indexing journals.
In addition to the subject-based abstracting and indexing journals, there are also several journals which restrict their coverage to the literature published in a particular country, e.g., Australian Public Affairs Information Service, which only covers literature published in Australia, or to specific types of literature, e.g., Dissertation Abstracts International, which limits its coverage to dissertations.
There are a number of points to remember when using abstracting/indexing journals:
- Choose the most appropriate journal for your purpose. To a large
extent this is a matter of practice and experience; however, a bit of logical
thought can greatly assist the process.
You can get help in choosing the right index from the Library staff, or from
the guide for your subject available in print or on the Subject
Resources page.
- Always use the cumulative volume indexes, if available,
in preference to the individual issue indexes.
- Delay periods of 6-9 months between the publication of an article and the appearance of an abstract of the article in one of the abstracting journals are common, so do not expect to find all articles published in 1978 included in the 1978 volume of an abstracting journal. A general rule would be to allow 6 months for an article to be listed in one of the secondary journals. These delays are being reduced as abstracting journals become available in electronic form via the Internet.
The first citation index to be published was Science Citation Index which commenced publication in 1963. Since then citation indexes covering the Social Sciences and Humanities have also been produced. Science Citation Index is a multi-disciplinary science index which covers all areas of both the pure and applied sciences. Literature coverage is restricted to the periodical literature with approximately 9,000 of the world's most important scientific and technical periodicals being regularly indexed.
Citation indexes are also often used to track the citations that have been made of an individual author's work.
The James Cook University Library subscribes to Web of Science which provides electronic access to three citation indexes: Science Citation Index Expanded, Social Sciences Citation Index, and Arts & Humanities Citation Index. These and other indexing databases are available via the Library's Electronic Databases page.
Current-awareness journals covering most scientific disciplines are currently being published and all claim to report the primary literature within three months of its publication. There is also a growing number of electronic current-awareness databases available through the Internet.
The journals generally conform to one of three formats:
- Reproduction of the content pages of specific primary journals,
- Subject classified list,
- Computer-generated title keyword indexes, i.e., KWIC indexes
REPRODUCTION OF JOURNAL CONTENTS PAGES
The best-known example of this type of current-awareness publication
is the Current Contents series of journals. There are six separate
journals in this series and five of these relate to scientific disciplines.
The two Current Contents journals of most relevance are Physical, Chemical
and Earth Sciences and Life Sciences. Each journal is composed
of photographic reproductions of the contents pages of the current issues
of selected journals. The contents pages are organized into broad subject
sections and scientists can keep up to date with the literature by either
scanning particular subject sections or the contents of specific journals.
The contents of over 7,000 journals, including some 800 Australian and
New Zealand titles, are regularly included in the six Current Contents
journals.
Current Contents is also available in a number of electronic formats.
SUBJECT CLASSIFIED LISTS
This is the most common format and the current-awareness journals in
physics (Current Papers in Physics), electronics (Current Papers
in Electrical and Electronics Engineering) and mathematics (Current
Mathematical Publications) conform to it. Scientists keep up-to-date
with the research literature by scanning the appropriate subject sections
in each of the fortnightly issues. This type of current-awareness journal
is generally based on the content of an associated abstracting journal,
e.g., Current Papers in Physics - Physics Abstracts, Current Mathematical
Publications - Mathematical Reviews, but as abstracts are not included
the articles are listed more quickly. The issues of these current-awareness
journals become redundant after about six months when abstracts of all
the listed articles have been published in the associated abstracting journal.
COMPUTER-GENERATED TITLE KEYWORD INDEXES
These are generally computer-produced KWIC indexes which use the keywords
in the titles of articles as index headings. As abstracts are not included,
the journals are produced quickly with consequent rapid reporting of the
periodical literature. The major example of this type of journal is
Chemical Titles, a current-awareness journal covering the chemistry
and biochemistry literature.
ELECTRONIC CURRENT AWARENESS DATABASES
Many current awareness publications are available in electronic format.
JCU Library has a guide to Alerting & Current Awareness Services which provides access to many of these.
Review series are periodicals which specialize in the publication of articles which review the progress and developments which have taken place in a particular subject field over a specific period of time. Most of the review series in science are published annually in monographic (textbook) form and they tend to have very similar titles, e.g., Advances in ..., Annual Review of ..., Annual Reports on ..., Progress in .... Each review series is devoted to a specific field of study, e.g., carbohydrate chemistry, lipid research, atomic physics, and each of the annual volumes in the series contains several reviews of particular topics relating to the subject field.
Review articles of this type are useful for three reasons:
- the articles are normally written by specialists in the subject
field to be reviewed so the reviews are generally authoritative,
- the articles are based on a survey of the literature associated with
the subject field and as a consequence they usually include an extensive
bibliography. It is not uncommon to find a review article with a bibliography
of two or three hundred references, which saves you from the problem of
having to find them yourself,
- probably the most important factor of all, the reviews are often
both selective and evaluative. Selective, in that the reviewer may have
selected from the great volume of primary literature those articles and
documents which she or he considers have made a significant contribution
to the field. Evaluative, in that the reviewer may evaluate the contents
of the documents she or he cites, i.e., the reviewer may comment on the accuracy
of a particular experimental technique or on contradictory experimental
data. This selective/evaluative feature of reviews is very important because
it helps to identify the important and significant documents from the mass
of relevant literature.
The basic limitation of these review series is the currency of the information which they contain. Review articles generally take considerably longer to write than the articles in the conventional primary journals because of their greater length (up to 100 pages) and the volume of primary literature which the reviewer has to read to produce the review. Add to this the increased delays in publishing articles in a textbook format and you find that the most recent document cited in one of these review articles is generally at least a year old. Consequently, review articles are useful as sources for older literature but cannot be used to keep up to date with recent material.
There are also several conventional primary journals, e.g., Chemical Reviews, Reports on Progress in Physics, which restrict their content to review articles, and occasional reviews can also be found scattered throughout the scientific periodical literature.
There is also a quick summary of how to find a journal article.
It was the introduction in the mid-1960s of computer photocomposition techniques to the production of abstracting and indexing journals that made automated literature searching a feasible alternative to the manual search. Machine-readable (computer) files of most of the world's abstracting/indexing journals and many standard scientific reference publications are now available for online computer searching. A computer search can achieve in several minutes results which would take hours or even days of manual searching.
There are now over 4,000 databases in existence extending across science, technology, business, the social sciences and humanities. These computer databases contain all the information included in the printed versions including the complete bibliographical details of each item, the abstract (if available) and all the indexing terms. Computer searching may mean accessing a databse held on a remote computer using either a telephone modem for connection or an Internet connection. Alternatively the machine readable database may be a CD-ROM disk held in the Library.
The computer can be used to locate documents on a particular topic, by a specific author, in a particular journal or published in a specific year. The most common form of searching is by subject and the general principles of computer based subject searching are outlined in the next section.
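The kinds of lookup just described (by author, journal or year) can be pictured as filtering a set of records on named fields. The following is only a minimal sketch: the record layout, the `find` helper and all of the sample entries are hypothetical and do not represent any real database or its contents.

```python
from dataclasses import dataclass

@dataclass
class Record:
    title: str
    author: str
    journal: str
    year: int

# Invented sample records, for illustration only.
records = [
    Record("Example article A", "SMITH, J", "Example Journal of Geophysics", 1994),
    Record("Example article B", "JONES, K", "Example Journal of Geophysics", 1996),
    Record("Example article C", "SMITH, J", "Example Physics Letters", 1996),
]

def find(records, **criteria):
    """Return the records whose fields equal every criterion given."""
    return [
        r for r in records
        if all(getattr(r, field) == value for field, value in criteria.items())
    ]

by_author = find(records, author="SMITH, J")
by_year_and_journal = find(records, year=1996, journal="Example Journal of Geophysics")

print([r.title for r in by_author])
print([r.title for r in by_year_and_journal])
```

Real search systems index these fields rather than scanning every record, but the idea of restricting a search to a field is the same.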
Concept analysis
This sounds complicated but is in fact very simple and involves breaking
down the search into a number of subject concepts. To illustrate with our
sample search, the search topic can be broken down into two concepts:
1. EARTHQUAKES/ROCK FRACTURES and 2. ELECTROMAGNETIC WAVES
The number of concepts in a search topic will vary, but most topics
can be defined in three or fewer concepts. This process needs to be undertaken
whether you are using a print or computer-based search tool.
Synonym Analysis
We have already encountered the problem of synonymous terms in the
scientific literature and the problem needs to be taken into account when
searching. The next stage in the search is to list all of the possible
terms and phrases that could be used to define each of the concepts of
the search, e.g.,
| EARTHQUAKES | ELECTROMAGNETIC WAVES |
| ROCK FRACTURES | ELECTROMAGNETIC NOISE |
| | ELECTROMAGNETIC EMISSION |
| | RADIO WAVES etc |
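If you are preparing a search on a computer, it can help to record the result of concept and synonym analysis as a simple mapping from each concept to its candidate terms. This is only an illustrative sketch; the variable name `concepts` and the concept labels are assumptions, while the terms themselves are those listed above.

```python
# Each search concept maps to the alternative terms that might describe it.
# The labels on the left are illustrative; the terms are taken from the lists above.
concepts = {
    "earthquakes": ["EARTHQUAKES", "ROCK FRACTURES"],
    "electromagnetic waves": [
        "ELECTROMAGNETIC WAVES",
        "ELECTROMAGNETIC NOISE",
        "ELECTROMAGNETIC EMISSION",
        "RADIO WAVES",
    ],
}

for concept, terms in concepts.items():
    print(f"{concept}: {len(terms)} candidate terms")
```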
Truncated terms
Several of the search terms and phrases listed above
can appear in a number of slightly variant forms, e.g., EARTHQUAKES could
be EARTHQUAKE, WAVES could be WAVE etc. Often, terms can appear in an adjectival
or verb form, e.g., POLLUTION, POLLUTED, POLLUTANT, POLLUTE etc. If you
are using a computer-based search tool, you need to consider how to deal
with this problem. As a computer search simply involves matching your search
terms against those appearing in the database, all variant forms should
ideally be listed. This would be an extremely tedious procedure so most
search systems provide a word truncation facility to simplify the process.
By way of example, the search term POLLUT? would match against all of the
words shown below;
POLLUT? matches POLLUTION, POLLUTE, POLLUTANTS etc
The ? is an example of a truncation symbol. Different databases use different symbols to retrieve the variant words that share a word stem, so check the help pages of the database you are using. The search terms in our sample search
can be modified to:
| EARTHQUAKE? | ELECTROMAGNETIC WAVE? |
| ROCK FRACTURE? | ELECTROMAGNETIC NOISE? |
| | ELECTROMAGNETIC EMISSION? |
| | RADIO WAVE? etc |
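The effect of truncation can be imitated with a regular expression. The sketch below assumes that ? stands for any number of further letters after the stem (some systems limit the number of characters); the function name `truncation_to_regex` is hypothetical, and the word POLLEN is added only to show a non-match.

```python
import re

def truncation_to_regex(term: str) -> re.Pattern:
    """Turn a term ending in '?' into a regex matching any word with that stem."""
    if term.endswith("?"):
        stem = re.escape(term[:-1])
        pattern = rf"\b{stem}\w*\b"
    else:
        pattern = rf"\b{re.escape(term)}\b"
    return re.compile(pattern, re.IGNORECASE)

pollut = truncation_to_regex("POLLUT?")
words = ["POLLUTION", "POLLUTE", "POLLUTANTS", "POLLEN"]
matches = [w for w in words if pollut.fullmatch(w)]
print(matches)  # ['POLLUTION', 'POLLUTE', 'POLLUTANTS']
```

A short stem retrieves more variants but also more irrelevant words, so the stem should be long enough to stay specific.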
Term proximity
Most computer search systems also allow you to specify the position
of search terms in relation to each other. For example, we want to search
on the phrase ELECTROMAGNETIC WAVE? so the two terms should be specified
as being adjacent to each other, e.g., ELECTROMAGNETIC(W)WAVE?, by the
use of the (W) adjacency symbol which is employed in some databases. If the adjacency requirement is not specified
then the computer may find the term ELECTROMAGNETIC in the title of a document
and the term WAVE in the abstract of the document, but the document may
have no content relating to ELECTROMAGNETIC WAVES.
Our sample search can now be modified as follows:
| EARTHQUAKE? | ELECTROMAGNETIC(W)WAVE? |
| ROCK(W)FRACTURE? | ELECTROMAGNETIC(W)NOISE? |
| | ELECTROMAGNETIC(W)EMISSION? |
| | RADIO(W)WAVE? etc |
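The difference that the adjacency requirement makes can be seen in a small sketch. It assumes that (W) means "immediately followed by, within the same field", which is one common interpretation; the helper functions and the sample record are hypothetical.

```python
import re

def words(text):
    """Split a field into upper-case words, ignoring punctuation."""
    return re.findall(r"[A-Z]+", text.upper())

def matches(word, term):
    """Compare one word with a search term; a trailing '?' means truncation."""
    term = term.upper()
    if term.endswith("?"):
        return word.startswith(term[:-1])
    return word == term

def adjacent(text, first, second):
    """True if a word matching `first` is immediately followed by one matching `second`."""
    ws = words(text)
    return any(matches(a, first) and matches(b, second) for a, b in zip(ws, ws[1:]))

def anywhere(record, term):
    """True if any field of the record contains a word matching `term`."""
    return any(matches(w, term) for field in record.values() for w in words(field))

# Invented record: both terms occur, but in different fields and unrelated senses.
record = {
    "title": "Electromagnetic compatibility of seismic sensors",
    "abstract": "Wave height was recorded at three coastal stations.",
}

loose = anywhere(record, "ELECTROMAGNETIC") and anywhere(record, "WAVE?")
strict = any(adjacent(f, "ELECTROMAGNETIC", "WAVE?") for f in record.values())
print(loose, strict)  # True False
```

Without the adjacency requirement the record is retrieved even though it has nothing to do with electromagnetic waves; with it, the record is correctly left out.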
Logic Analysis
Whilst we have so far listed all the appropriate terms and phrases
to define the various aspects of the search, we have not yet defined the
relationship between these terms. If we are undertaking a search on a
computer-based search tool, we need to do this. Relationships are defined by using
Boolean logic, that is, by the operators AND, OR and NOT. To illustrate the
use of the Boolean operators, in our sample search we require any one of
the "ELECTROMAGNETIC WAVE" phrases to appear in a record but not necessarily
all of them, i.e., we want:
ELECTROMAGNETIC(W)WAVE? OR
ELECTROMAGNETIC(W)NOISE? OR
ELECTROMAGNETIC(W)EMISSION? OR etc
However, we require at least one term from each concept list to appear in a record, i.e., we require EARTHQUAKE? OR ROCK(W)FRACTURE? AND at least one of the "electromagnetic wave" concept terms. One possible search strategy, with its Boolean logic features, would then be:
(ELECTROMAGNETIC(W)WAVE? OR ELECTROMAGNETIC(W)NOISE? OR ELECTROMAGNETIC(W)EMISSION?) AND (EARTHQUAKE? OR ROCK(W)FRACTURE?)
Note the use of parentheses to specify the order for the logical operators.
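One way to picture the Boolean operators is as operations on sets of record numbers: OR combines sets (union) and AND keeps only the records common to both (intersection). The sketch below evaluates the strategy above against four invented records. The helper names `phrase_regex` and `search` are hypothetical, and the translation of ? and (W) into a regular expression is an assumption about how such a system might behave, not a description of any particular database.

```python
import re

def phrase_regex(term):
    """Translate e.g. 'ROCK(W)FRACTURE?' into a case-insensitive regex."""
    pieces = []
    for part in term.split("(W)"):
        if part.endswith("?"):
            pieces.append(re.escape(part[:-1]) + r"\w*")  # truncated word
        else:
            pieces.append(re.escape(part))                # exact word
    return re.compile(r"\b" + r"\W+".join(pieces) + r"\b", re.IGNORECASE)

def search(records, term):
    """Return the set of record numbers whose text matches the term."""
    rx = phrase_regex(term)
    return {rid for rid, text in records.items() if rx.search(text)}

# Invented records, for illustration only.
records = {
    1: "Electromagnetic emissions recorded before an earthquake",
    2: "Radio waves in the ionosphere",
    3: "Rock fracture under laboratory loading",
    4: "Electromagnetic noise from rock fractures",
}

# OR within a concept is a union of sets.
em = (search(records, "ELECTROMAGNETIC(W)WAVE?")
      | search(records, "ELECTROMAGNETIC(W)NOISE?")
      | search(records, "ELECTROMAGNETIC(W)EMISSION?"))
quake = search(records, "EARTHQUAKE?") | search(records, "ROCK(W)FRACTURE?")

# AND between concepts is an intersection.
hits = em & quake
print(sorted(hits))  # [1, 4]
```

Record 3 matches only the earthquake concept and record 2 matches neither list as written, so only records 1 and 4 satisfy both halves of the strategy.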
NOT logic is used to specify aspects of the search which are to be excluded from the search results. For instance, we may not require any documents relating to earthquake detection in Japan, i.e., NOT JAPAN:, or written by a particular author, e.g., NOT SMITH,J: /AU. You need to be very careful when deciding to use the NOT operator because it excludes every reference that contains the excluded term, whatever else that reference includes. It is easy to exclude useful material.
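The risk with NOT can be shown with sets of record numbers, where NOT corresponds to set difference. The records below are invented, and the plain word JAPAN is matched by a simple substring test rather than by any database's own syntax.

```python
# Invented records that all satisfy the earthquake/electromagnetic search.
records = {
    1: "Electromagnetic emissions before earthquakes in California",
    2: "Earthquake electromagnetic noise in Japan",
    3: "Electromagnetic emission methods compared: Japan and Greece earthquakes",
}

hits = set(records)  # assume all three were retrieved by the main search

# NOT JAPAN removes every record mentioning Japan, whatever else it covers.
japan = {rid for rid, text in records.items() if "japan" in text.lower()}
remaining = hits - japan

print(sorted(remaining))  # [1]
```

Record 3 also covers Greece and might well have been useful, but it is lost because it mentions Japan.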
This information on Combining Search Words further illustrates Boolean operators.
There are three ways to locate information via the World Wide Web:
- if you have been given the address of a site of interest (the
URL or Uniform Resource Locator), going directly to it is the most successful method.
Some journals and newspapers now review Web sites in the same way that
they review books and there is a brisk exchange of information between
specialists in any scientific field.
- many organisations now maintain pages of links to other sites of
relevance to their work and this process of going from a known link to
a new one ('surfing') remains a good way to find material.
- if you are not able to use the options above, you can try one of the several search engines or directories but none of them is complete and you need to use all the search skills you have developed to get the most from your search. James Cook University Library has a series of links to some useful Internet searching facilities.