Computer-based Simulation and Modelling
Michael D. Fischer


1 Simulation

When we apply a rule or set of rules to a set of input information, we derive a set of results
corresponding to the inputs. The process of deriving this set of instances of applying the
rule can be called simulation. The rule is a model which describes relationships; the
application of the rule to generate an outcome is a simulation.

For example, if we take a rule adapted from Islamic Shariat law:

  If talaq is said three times in succession to the wife before two male witnesses the marriage is
dissolved, otherwise the marriage continues.

and apply this rule to the following information – talaq was said twice to the wife before
two male witnesses – we arrive at the result that the marriage continues.
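
As a minimal sketch, this rule and its application can be written in Prolog, the language
used for the examples in §1.7. The predicate name marriage_outcome, and the reduction of
the rule to two counts (utterances of talaq and male witnesses present), are illustrative
assumptions rather than part of any established program.

```prolog
/* marriage_outcome(+TalaqCount, +MaleWitnesses, -Outcome)
   Hypothetical encoding of the rule: the marriage is dissolved only if
   talaq was said at least three times before at least two male witnesses. */
marriage_outcome(TalaqCount, MaleWitnesses, dissolved) :-
    TalaqCount >= 3,
    MaleWitnesses >= 2.
marriage_outcome(TalaqCount, MaleWitnesses, continues) :-
    \+ ( TalaqCount >= 3, MaleWitnesses >= 2 ).
```

The rule is the model; posing the query ?- marriage_outcome(2, 2, Outcome). is the
simulation, and it answers Outcome = continues, as in the prose example.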

Simulations of this sort are nothing new to anthropologists – we do them all the time in our
heads or on paper to validate our analyses against data, to explore the properties of our
models, or to extrapolate our models to new situations. Such simulation, predating
computers, has been used in anthropology at least since the nineteenth century (Mulvaney
1970). Simulation, with caution and reservation, is used to observe rituals, ceremonies
and activities, which for some reason cannot be observed in the ordinary course of
fieldwork (Clammer 1984:72–3; Ellen 1984:274). Indeed, formal interviews and ‘set-ups’
(Jackson 1987:41) meet this sense of simulation to some extent. Finally, we practice
simulation each time we ‘play out’ our models and analyses in our minds or on paper,
testing against observed data, and evaluating the results.

The use of computers for simulation modelling was among the first encounters of social
anthropologists with computing (Kunstadter et al. 1963; Gilbert and Hammel 1966), not
only because simulation met more or less the conception of what computers did in the early
1960s, but also because anthropologists at that time were beginning to explore the use of
more sophisticated models and attempting to apply a more systematic perspective to
anthropology.

1.1 Introduction

Simulation is a kind of modelling which is useful for a wide range of problems and
situations. It has applications to both quantitative and qualitative problems with either very
good data, or very little data. It has important implications for disciplines such as social
anthropology which are basically non-experimental, providing a means of exploring
problems which could never be observed to order. Simulation is not a panacea for all of
our problems, but it can be an important tool for the social researcher aware of its
limitations.

Simulations are distinguished from other kinds of models more in terms of goal than form.
Simulations are typically used for problems which are seen as complex and intractable,
where no direct means of evaluation is known or the conventional means of evaluation is
extremely difficult to execute, or which require interactive decisions by the investigator
during the course of the model.

In social anthropology the most common (and successful) simulations have been based on
the interaction of models of prescriptive or preferential marriage, incest or other social
phenomena with either demographic models or ecological models (or both) (Kunstadter et
al. 1963; Hammel and Gilbert 1965; Randolph and Coult 1965; MacCluer and Dyke 1976;
Black 1978; Relethford 1981; Buchler et al. 1986). The fundamental idea underlying these
simulations is to investigate the performance of social models in context with ‘well-
understood’ models, including the ethnographic model of collection.

Although simulations can be quite abstract and analytic, most anthropologists tend to
favour those which are fairly concrete. One reason for this is the emphasis of social
anthropology on structural relationships between individuals. If you are investigating the
feasibility of literal prescribed matrilateral cross-cousin marriage (cf. Kunstadter et al.
1963), then you must usually simulate a population as a set of people, not as a simple
aggregate. Each simulated person must have at least a mother and father, an age, a gender,
a marital status, and be subject to birth, marriage and death, and have, in some cases, a
history.

A simulation animates our models to produce data which we can use to evaluate these
models. This is of course possible to do without computers, but it is a very time-consuming
effort. Most simulations have been applied to theoretical situations, where simulation was
most useful precisely because these could not be observed directly. In the past few years,
however, simulations have been applied back to the field with promising results.

Lansing (1991) describes a simulation which resulted from his fieldwork in Bali on the role
of water temples, and the rituals associated with them, in the regulation and conservation
of irrigation water for rice cultivation and, more controversially, in pest control. Although
a large part of the simulation related to ecological parameters, the overall significance
depended heavily on ethnographic data relating to how the water temples functioned
ritually, and how information flowed from the water temples to the peasants who used
irrigation water for their crops. It appears that among the results of the simulation project
was a basis for reversing the official policy of the state and development agencies towards
the water temples, which are now recognised by the state and ‘have regained informal
control of cropping patterns in most of Bali’ (Lansing 1991:125).

Kippen (1988) applied a novel version of simulation, using a production system/expert
system (§2) to represent indigenous knowledge about improvising tabla music. Animating
this model created not a literal set of recordings, but an improvisational ‘performance’ by
Kippen’s model: literally something new, but conforming to a pattern which his expert
consultants (tabla musicians) could make judgements about and criticise, and which set a
context for Kippen to elicit new information on which to base modifications to the expert
system rules.

In the past we could argue that there was no real way to produce a directly testable model.
We cannot yet produce formally provable models, but there is no reason why, at a
micro-level, we cannot make statements about what we believe we know, and evaluate
these with respect to what we think should be the outcome. Analysis should at least be
subject to a test of the internal consistency of the representation, regardless of how we
want to argue about the external reliability or lack thereof.

1.2 Uses of simulation modelling

Simulation can be defined in either a broad or narrow sense. In the narrow sense
simulation is a model where we are attempting to describe the behaviour of a system by
incrementally and interactionally applying a number of models against some starting
situation. In the broader sense, it is any kind of computer model within which some
structure is being modified along one or more dimensions. In the usual case one dimension
is time, but this is neither sufficient nor necessary for a simulation; it is the modelling of
any complex system where observation of change, or an incremental process, is central.
Both of these descriptions have in common the modelling of some structure under
modification or transformation; the behaviour of some data object along one or more
dimensions of change.

Traditionally simulation has been a technique used for quantitative analysis. As with
computing techniques in general this is due to the historical development of computing and
constraints on our knowledge of how to represent models and information of a qualitative
and symbolic form. Designing a simulation involves translating the essential aspects of
pre-existing models into a form which can be implemented on a computer so that we can
monitor the interaction of the models.

1.3 What is a computer simulation?

Despite my assertions in §1.2, this is, unfortunately, not such an easy question to answer
from the literature. As with many terms in the literature, computer simulation is not so
much defined as described. Johnson ‘defines’ a computer simulation as

  ...a computer program that defines the variables of a system, the range of values those
variables may take on, and their interrelations in enough detail for the system to be set
in motion to generate some output. The main function of a computer simulation is to
explore the properties and implications of a system that is too complex for logical or
mathematical analysis... A computer simulation generally has an ad hoc or “homemade”
quality that makes it less rigorous than a mathematical model
 (1978:186–7)

Nardi offers as a description:

  A computer simulation model ... [provides] the investigator with a simplified analogy
... for the purpose of better analysing and understanding ... [some] phenomenon. ... it
focuses on conducting experiments on a computer in which mathematical or logical
operations describing the behaviour of a system over time are of primary importance.
... Its very purpose, in fact, is the analysis of change over time. ... Computer
simulation is a powerful technique, capable of handling large numbers of variables
representing complex systems and of simulating the operation of these variables over
many cycles.
 (1980: 38)

Davies and O'Keefe, in a programming guide for simulation, suggest:

  When the word [simulation] is used by computer scientists, statisticians, and
management scientists, they normally refer to the construction of an abstract model
representing some system in the real world. The simulation describes the pertinent
aspects of the system as a series of equations and relationships, normally embedded in
a computer program. (1989:1)

These descriptions of simulation might be adequate for many purposes, but it is worth a
closer look at a more structural definition of simulation in the context of how anthropologists
have used them and might use them, if for no other reason than to provide a basis for
designing simulations and interpreting the results, a subject treated with extreme coyness
in the literature.

Specifying the structure of a simulation is very much like specifying any problem for
computer treatment: we have to visualise what we want to get out of a simulation, and
devise a structure that will fulfil this goal. Most descriptions of simulation, especially those
by anthropologists, suggest that a simulation model is:

a) used to model systems too complex to model with ordinary analytic models
b) comprised of logical or mathematical operations on variables
c) able to represent a large number of variables and operations
d) holistic in orientation, often used to model aspects of entire systems
e) oriented towards representing processes
f) oriented towards exploration and experimentation

Items a and b suggest that although simulations are composed of analytic models,
simulations are not necessarily analytic models (Johnson 1978:186–7). In other words,
simulations can be informal models defined partly in terms of formal models. Items c
and d reflect a view that simulations are suitable for modelling complex large scale systems
as well as simple small scale systems (Johnson 1978:187; Nardi 1980:38), and are thus
‘capable of greater realism’ (Johnson 1978:187). Item e expresses a pervasive attitude that
simulations model processes. It is more accurate to say that simulations produce their
results from one or more processes. These processes may be used as part of a model of
systemic processes, or they may simply be artefacts of the simulation. Most simulations
have been used to model processes, but incorporate both relevant and artefactual
processes. Item f indicates a pre-analytic bias in the use of simulation; simulations are
oriented towards producing model data for analysis, not solutions (Buchler et al. 1986:112),
although the computer implementation of a simulation model may include analytic
models which apply to the model generated data, and simulations may be used in some
cases when no other solution is plausible or possible (Dyke 1981:204).

These properties suggest a number of possible structures, but are not complete. What
distinguishes a simulation model from other model forms is not so much the type of
model, but what we do with the model. In the case of simulations we are interested in the
behaviour of a model; instances of application of a model. Simulations do not have
solutions in the conventional sense. The most appropriate purpose of a simulation is to
generate data, representing the interaction of the models under simulation. The value and
purpose of a simulation follows from what is done with this model data (Dyke 1981:204).

Extending this, I propose a more general structure for computer simulations. Abstractly, a
simulation model consists of at least one structure, at least one operation which might act
on the structure(s), and at least one opportunity to apply operation(s) to structure(s)
(Figure 1). It is the application of one or more models to create one or more instances. An
operation may or may not be based on an analytic model; it can be quite ad hoc. A
simulation is at least one instance of an application of operation to structure. This definition
does not differentiate between the application of analytic models, such as a discriminant
function derived from social data, and less formal models, such as those derived from so-
called qualitative analysis of social data.


  simulation_1.gif
Figure 1.  Abstract model of simulation
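
This abstract structure can be sketched directly in Prolog, the language used for the
examples in §1.7. The predicate names simulate and age_one_year are illustrative
assumptions; the operation is deliberately ad hoc, and each of the Steps is one opportunity
to apply it to the structure.

```prolog
:- use_module(library(lists)).

/* simulate(+Operation, +Steps, +Structure0, -Structure)
   Apply Operation to the structure Steps times. */
simulate(_Operation, 0, Structure, Structure).
simulate(Operation, Steps, Structure0, Structure) :-
    Steps > 0,
    call(Operation, Structure0, Structure1),   /* one application of operation to structure */
    Steps1 is Steps - 1,
    simulate(Operation, Steps1, Structure1, Structure).

/* A hypothetical operation: the structure is a list of ages,
   and everyone grows one year older. */
age_one_year(Ages0, Ages) :-
    findall(Age, ( member(Age0, Ages0), Age is Age0 + 1 ), Ages).
```

For instance, ?- simulate(age_one_year, 3, [10, 25, 40], Ages). gives Ages = [13, 28, 43].
Nothing here requires the operation to be an analytic model; any predicate relating one
state of the structure to the next will do.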

1.4 Types and Uses of Simulation

There are a number of approaches to simulation depending on the purpose for which the
simulation is intended, the models available, and the information available. However, the
general form of a simulation is fairly regular. Simulations are almost always used to
validate, explore or extrapolate the properties of a model of some system. The basis of
simulation is modelling the behaviour of a model, and thus it is possible to have a very
simple simulation consisting of a single model operation applied to a single model object
(or structure). However, a simulation is usually a model of a system consisting of at least
two sub-models, which interact either by one being directly influenced by the other, by
direct mutual influence, or by both operating on at least one common object (Figure 2).

One general motivation for simulation is that we may lack the means to substantively
describe or characterise the interaction of these sub-models using either prose or formal
devices, but we can observe their interaction case by case. Simulation is then suitable for
those situations which are beyond our normal means to model in a concrete fashion. It is
notable that simulations are widely used in applications of the physical sciences in
engineering, for although these disciplines have very good models for describing the local
interaction of variables, they too, in general, lack the means to describe or predict the
interactions that a complex system might have. Thus airplanes are not built on the basis of
first principles of physics, but are designed using these principles, and the designs are
refined, first using computer simulations, and then physical simulations such as a wind
tunnel.

There are two basic types of simulation components, stochastic and deterministic, which
roughly correspond to Lévi-Strauss's distinction between statistical and mechanical models
(Lévi-Strauss 1965).

Stochastic simulations generally have a probabilistic component, though not usually
unconditioned probabilities (e.g. sampling from the normal distribution, the Poisson
distribution, or even catastrophe spaces where the probability density changes with different
paths, or other spaces, for that matter). Stochastic simulations generally must be performed a
number of times until there is an adequate sample of results to evaluate, since any single
‘run’ of a simulation will have one of many possible outcomes. This sort of simulation is
often used as a component in demographic or ecological simulations where many of the
components are modelled using statistical criteria. They are especially useful where much
of the data upon which they are based has been amenable to statistical analysis, or where
the local behaviour of many systems is only understood in statistical terms, or where it is
convenient to model many of the components using statistical or probabilistic models,
which is desirable much of the time, even if it is not in theory necessary. It is, for example,
much easier to model rainfall or warfare with a simple probabilistic model (sampling from
an empirical distribution), rather than going to the trouble of an elaborate model which is
itself extremely complex, simply because we want to estimate the impact of floods or the
loss of male population due to warfare as an independent context for our central problem.
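
As a hedged sketch of such a component, the following Prolog samples annual rainfall from
an empirical distribution and repeats the run to build up a sample of outcomes. The rainfall
figures, the flood threshold and all predicate names are hypothetical, and the random_member
predicate is assumed to come from SWI-Prolog's library(random).

```prolog
:- use_module(library(random)).
:- use_module(library(apply)).

/* Hypothetical observed annual rainfall (mm); this list is the
   empirical distribution we sample from. */
observed_rainfall([420, 510, 380, 760, 450, 490, 830, 400]).

/* Draw one simulated year by picking an observed value at random. */
sample_rainfall(Mm) :-
    observed_rainfall(Values),
    random_member(Mm, Values).

at_least(Threshold, Value) :- Value >= Threshold.

/* One run: count the years out of Years whose rainfall reaches Threshold. */
flood_years(Years, Threshold, Floods) :-
    findall(Mm, ( between(1, Years, _), sample_rainfall(Mm) ), Sample),
    include(at_least(Threshold), Sample, Wet),
    length(Wet, Floods).

/* Many runs: collect the flood count from each run for later evaluation. */
flood_runs(Runs, Years, Threshold, Counts) :-
    findall(Floods,
            ( between(1, Runs, _), flood_years(Years, Threshold, Floods) ),
            Counts).
```

A query such as ?- flood_runs(100, 10, 700, Counts). returns one flood count per run. Any
single count is only one of many possible outcomes, which is why it is the list as a whole
that is evaluated.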

Deterministic simulations are those where one solution set exists for a given input
situation. Their primary use is to extrapolate and evaluate outcomes given hypothetical
inputs, or to examine the interaction of a number of inter-dependent deterministic models.
Depending on the form of the model used, such an exercise may not be, by some definitions,
a simulation proper, but I am including any animation of models within this category.

Polyvalued simulations have more than one outcome for a given starting point, but
probability plays no identifiable part in the multiplicity of outcomes (either due to lack of
knowledge or because we are not concerned with probability); there are simply many
outcomes, representing operations which are not functions. With this type of simulation
we are usually concerned with the set of solutions as a whole. This kind of simulation
could be used to explore all the possible marriages that might exist in a specific population,
and the different structural outcomes of each marriage. This allows us to at least determine
the boundaries of a system. It is useful in situations where a number of different rules
could apply in a given context.
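
Prolog's backtracking makes this kind of simulation easy to sketch, since a relation that is
not a function simply has several solutions. In the following the population is hypothetical,
the predicate names are assumptions, and the age rule is borrowed from the example
developed in §1.6 purely as a placeholder.

```prolog
/* person(Id, Age, Sex): a hypothetical population */
person(abdul,  24, male).
person(karim,  20, male).
person(rubina, 18, female).
person(zarina, 15, female).
person(salma,  14, female).

/* One possible marriage; there may be many for a given male. */
possible_marriage(Male, Female) :-
    person(Male, AgeM, male),
    person(Female, AgeF, female),
    AgeM >= 18,
    Diff is AgeM - AgeF,
    Diff >= 5,
    Diff =< 7.

/* The set of solutions as a whole. */
all_possible_marriages(Pairs) :-
    findall(Male-Female, possible_marriage(Male, Female), Pairs).
```

Here ?- all_possible_marriages(Pairs). gives Pairs = [abdul-rubina, karim-zarina, karim-salma].
No probability is attached to either of karim's two possible marriages; both are simply
within the boundaries of the system.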

What can we gain from using computer simulations in our research? Simulation is
appropriate where we can reasonably model the circumstances and context of some
complex behaviour, and want to explore and evaluate models of that behaviour. For
example, we are sometimes concerned with the plausibility of the practice of some stated
rules or preferences, say a preference for marriage to FB{SD}. One of the ways we can
explore a system of this sort is to examine some model under various known
circumstances and compare the results with our own data. In the above case we can model a population
demographically, state some rules, preferences, and conditions for marriage, and examine
some of the outcomes of applying these to the model population. It is important when
using simulation as a part of analysis to avoid two errors posed by Dyke (1981:202-3).
His methodological error refers to the process of elaboration of simulations to improve
their ‘realness’ to the point that it is difficult to ascertain the impact of any of the simulation
elements. His heuristic error refers to concluding that ‘life’ mimics the simulation. The fact
that a given simulation might conform very closely to observed situations does not argue
that these operate on the same principles, only that there is some degree of logical
similarity between the processes in the simulation and the processes of the simulated
system (Dyke 1981:203).




  simulation_3.gif
simulation_2.gif

Figure 2. Distributed simulation model.

1.5 Qualitative Simulations

Often in social anthropology we have data which is impossible to quantify or sample,
where we cannot give precise counts or parameters, or even meaningful probabilities.
Because of the nature of collecting ethnographic data, data is recorded as it arrives, and
only contextual information can be systematically collected in a form that could represent a
proper sample from which a probability could be estimated. However these qualitative
collections are the essence of ethnographic collection. They apparently serve most of the
needs of social anthropologists, and we do not tire of attempting to analyse them. Most
simulation techniques are based on numerical and/or statistical criteria, and make little
sense in analysis of marriage rules, sections, totemism or dreams.

It is possible to simulate using only qualitative objects and structures, although the use and
evaluation of these simulations are somewhat different, and in many ways more complex
than quantitative simulations, especially with respect to validation. In a qualitative
simulation we have at best categories as parameters (although these may be expressed
based on a probability), and the simulation focuses on the relationships between categories
and objects within the simulation; where the structure is as important, or more so, than the
content.

There are two basic approaches for qualitative simulations. The first is fairly conventional,
which simply consists of transforming an input structure into an output structure. This
corresponds to the usual design of a quantitative simulation. The second approach is by
resolution. With resolution we are simulating a system of rules and conditions, and using
the simulation to answer specific questions by resolving whether, given the information in
the simulation, a specific situation can occur. In the former, we alter the input structure to
generate different output structures, and evaluate these output structures. With resolution,
we hypothesize different possible output structures, and the simulation tells us whether
they are possible or not within the model of the simulation.
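
Resolution is the way Prolog itself answers queries, so the second approach can be
illustrated with very little code. This is a sketch only: the people, their parents and the
predicate names are hypothetical, and the rules (sibling avoidance and an age difference
of 5 to 7 years) anticipate the examples developed later in this chapter.

```prolog
/* person(Id, Age, Sex, Father, Mother): hypothetical individuals */
person(abdul,  24, male,   yusuf, amina).
person(rubina, 18, female, yusuf, amina).
person(zarina, 18, female, hamid, noor).

/* Full or half siblings share a father or a mother. */
siblings(A, B) :- person(A, _, _, F, _), person(B, _, _, F, _), A \== B.
siblings(A, B) :- person(A, _, _, _, M), person(B, _, _, _, M), A \== B.

/* The system of rules and conditions being simulated. */
can_marry(Male, Female) :-
    person(Male, AgeM, male, _, _),
    person(Female, AgeF, female, _, _),
    \+ siblings(Male, Female),
    AgeM >= 18,
    AgeM - AgeF >= 5,
    AgeM - AgeF =< 7.
```

We hypothesize an output structure and ask whether it can occur: ?- can_marry(abdul, rubina).
fails, because the two are siblings, while ?- can_marry(abdul, zarina). succeeds.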

1.6 Design of Simulations

Simulations usually require programming of some sort. Because any computing language
that we might use has a limited set of resources available for representing data,
information, relationships and transformations that can be applied to these, we must
carefully design our simulation. Because we are first interested in a specific problem it is
important that the design develop according to the needs of the research, and not from the
needs of the computer. Although we will ultimately be limited by the resources available to
us on the computer, it is better to make the simplifications and compromises at the later
stages rather than the earlier ones of the design. In this way we have considered our
requirements in detail, and can better judge how to solve the more mundane problems of
implementation. Even if you will not ultimately write the programming code for the
simulation yourself, you need to be able to describe it in the terms of stages A, B and C
below for the programmer.

A) At the earliest stages of the design we want a simple statement of the problem and the
  objectives for which we want to construct simulations. This should be a prose
  statement in human readable form which clearly states the meaning and intent of the
  elements that will be represented in the proposed simulations. The purpose of this step is
  to begin to lay out clear criteria for judging the adequacy of later stages, since these
  should ultimately be a representation of this initial statement. For example's sake we
  begin with the following simple problem: we want to assess the effect of different
  constraints on mate selection on the structure of monogamous marriage in a closed
  population. Constraints will include age, status and group or kinship relations. The
  kinds of structures we are interested in include age structure, relatedness and the
  proportion unmarried. For purposes of example we will follow a cohort for ten years.

B) The second step is to select the conditions for a specific simulation that will contribute to
  the statement of purpose. For the first simulation, we will select the effect of age
  constraints on marriage structure. Here we need to state fairly precisely exactly what we
  will need to write the simulation. In general we need a list of the agents or objects to
  which actions or transformations will be attributed, the relationships that can exist
  between different agents/objects, rules of interaction and transformation, and the kinds
  of information that will be required. Perhaps more importantly, we need to specify what
  conditions will constitute the stages of the simulation, what information we need to
  extract from the simulation, and when. For the example we can use:

  Agents: people
  Relations between people: marriage
  Constraints on marriage: male should be 18 or older, and female between 13 and 18,
    but at least two years younger than male. A male and female can only marry if both
    are unmarried.
  Information required about persons: age, sex, marital status
  Information derived about persons: proportion married

C) The third stage is to decide how to represent the elements described in stage B). Here
  we are getting a step closer to the translation into a computing language, but we still
  want to remain a bit aloof at this stage with respect to which programming language.
  However, all existing programming languages share a great deal in their general strategy
  for representation.

Explicit object definitions are usually represented as a set of categories, and instances of
objects are usually represented as a set of values for these categories. For example we
might define the object type PERSON as SEX, AGE, and MARITAL STATUS, and a
specific instance of PERSON as male, 14, unmarried. In other words we define a data
type PERSON, which will describe how to access and interpret information about specific
PERSONs. These explicit or simple objects are the basic means programming languages
use to represent data. Some objects like PERSON consist of several categories, some
consist of a single category (which is usually its name), and some have complex
categories, which contain not simple values, but a reference value that permits us to locate
the information in another object. We will look at references in the next example, but we
could have a reference to spouse in the MARITAL STATUS category, instead of the
simple married/unmarried value. In this way we would have access to information about
spouse as well as ego. However, because of our simple objective in this case, to find the
proportion of married to unmarried, we do not require this information. Only information
that is required need be considered, since our PERSON in this case is a model of a person
as required for the objective.

Because of the simple means of representing the married/unmarried relation, we will not
have any interperson relationships, so the representation of the marriage relation is simple.

The constraints on marriage choice will be given as rules. In this case something like: IF
the age of the male is greater than or equal to 18 and the female is between 5 and 7 years
younger than the male THEN marriage is ok, otherwise not.

Finally we have to consider what kind of information we want to get out of the simulation.
In this case it is proportion of married to unmarried males and females, and probably more
specifically the proportion of eligible males and females. The former is fairly easy to
accomplish: we need only count the number of unmarrieds before we attempt to marry
them off, and compare this to the number of unmarrieds afterwards. Specifying the
proportion of eligible males and females is a bit more complicated, especially if we are
strict about eligibility, since eligibility could be construed to be dependent on the prior
existence of a male or female of the proper age. A weaker constraint would be to select 18
for males as the eligibility age, and 11 for females. This weakening is reasonable, since we
are in a sense doing the simulation to find out the eligibility rate, but there is value in the
stronger constraint as well, since an 18, 19, and 20 year old male can marry a 13 year old
female, but a 20 year old could also marry a 15 year old female, while the 18 and 19 year
old could not. Thus the 20 year old can eliminate a possible mate for the younger males.
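
The counting involved can be sketched in Prolog as follows. The population and the
predicate names are hypothetical, the person fact anticipates the representation used in
§1.7, eligible_age implements only the weaker eligibility constraint, and aggregate_all is
assumed to come from SWI-Prolog's library(aggregate).

```prolog
:- use_module(library(aggregate)).
:- dynamic person/4.

/* person(Id, Age, Sex, Status): a hypothetical population */
person(abdul,  24, male,   married).
person(karim,  19, male,   unmarried).
person(rubina, 18, female, married).
person(zarina, 12, female, unmarried).

/* Number of persons of a given sex and marital status. */
count_status(Sex, Status, N) :-
    aggregate_all(count, person(_, _, Sex, Status), N).

/* Proportion of persons of one sex who are married. */
proportion_married(Sex, Proportion) :-
    count_status(Sex, married, Married),
    count_status(Sex, unmarried, Unmarried),
    Total is Married + Unmarried,
    Total > 0,
    Proportion is Married / Total.

/* Weaker constraint: eligibility depends on age alone. */
eligible_age(male, Age)   :- Age >= 18.
eligible_age(female, Age) :- Age >= 11.

count_eligible(Sex, N) :-
    aggregate_all(count, ( person(_, Age, Sex, _), eligible_age(Sex, Age) ), N).
```

Calling count_status(male, unmarried, N) before and after the marriages are attempted gives
the two figures to compare. The stronger constraint would need an additional check that a
partner of suitable age actually exists in the population.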

This latter point raises an important issue about the process of making the marriage
decision. How are we to decide the order of choice among the males? This is an issue that
emerges over and over in even the simplest simulations, or perhaps especially in the simplest
simulations, since they often have the most simplified decision models. Sometimes we can
decide to use a simple principle, such as oldest (or higher statuses etc.) choose first. It is
best if there is some ethnographic evidence for these kinds of principles, but they can be
used without if you are willing to accept the bias that is introduced. Another choice,
especially if one is intending to run the simulation several times, is to randomly apply the
decision, perhaps with a bias towards some principle, say older males are more likely to
choose before younger, but not necessarily. In our simple example we will choose the
oldest first principle, but will examine the randomized approach.
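
Both orderings can be sketched in a few lines of Prolog. The facts and predicate names are
hypothetical; the four-argument sort and random_permutation are assumed to be those of
SWI-Prolog, and the biased variant mentioned above is not attempted here.

```prolog
:- use_module(library(pairs)).
:- use_module(library(random)).

/* person(Id, Age, Sex, Status): a hypothetical population */
person(karim,  20, male,   unmarried).
person(abdul,  24, male,   unmarried).
person(omar,   19, male,   unmarried).
person(rubina, 18, female, unmarried).

/* Oldest first: sort Age-Id pairs in descending order, then drop the ages. */
males_oldest_first(Ids) :-
    findall(Age-Id, person(Id, Age, male, unmarried), Pairs),
    sort(0, @>=, Pairs, Sorted),
    pairs_values(Sorted, Ids).

/* Random order: a different order of choice on each run. */
males_random_order(Ids) :-
    findall(Id, person(Id, _, male, unmarried), All),
    random_permutation(All, Ids).
```

With these facts ?- males_oldest_first(Ids). gives Ids = [abdul, karim, omar]. A marriage
predicate would then work through the list in that order, whereas the random ordering only
makes sense if the simulation is run several times and the results compared.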

1.7 Implementation

The examples are represented in the programming language Prolog (see Chapter
«Kinprog» for description of basic features; Bratko 1986 is a good introductory text).
Prolog has a number of properties that make it very useful for qualitative and
non-deterministic simulations, and is weakest in quantitative simulations. There are some
problems with Prolog, because although we will find it relatively easy to represent the
sub-models and their interaction in Prolog, it is often difficult or cumbersome to evaluate
the results.

We can define our basic object type in Prolog by using a person fact:
person(id,age,sex,status), which we repeat for every person in our population, where id is
a unique name, age is a numerical age, sex is male or female, and status is married or
unmarried. These could be read from a data file, created by an initializing program module
(in Prolog a program module is called a predicate), or typed in. We could then use a very
simple (and unrealistic) marriage rule, ‘each unmarried male at or beyond the age of 18 will
marry the first female encountered who is 11 years of age or older and who is between 5
and 7 years younger than the male’.

  /* Database */

  :- dynamic person/4.   /* person facts are changed while the program runs */

  /* person(Id, Age, Sex, Status) */
  person(abdul, 24, male, unmarried).
  person(rubina, 18, female, unmarried).
  /* ... */
  person(zarina, 22, female, unmarried).

  /* Rules */

  marry_all :-                 /* marry all eligible people */
      marry(_Male, _Female),   /* marry a couple */
      fail.                    /* forces evaluation of next couple */
  marry_all.                   /* so that marry_all will succeed after all marriages */

  marry(Id_male, Id_female) :- /* marry people if eligible */
      eligible(Id_male, Id_female),
      change_marital_status(Id_male, Id_female).

  eligible(Id_male, Id_female) :- /* check eligibility for marriage */
      person(Id_male, Age_male, male, unmarried),        /* unmarried male */
      person(Id_female, Age_female, female, unmarried),  /* unmarried female */
      age_check(Age_male, Age_female).

  age_check(Age_male, Age_female) :- /* check to see if ages are compatible */
      Age_male >= 18,
      Age_male - Age_female =< 7,    /* Prolog writes less-or-equal as =< */
      Age_male - Age_female >= 5.

  change_marital_status(Id_male, Id_female) :- /* change from unmarried to married status */
      retract(person(Id_male, Age_male, male, unmarried)),        /* remove old entry */
      assert(person(Id_male, Age_male, male, married)),           /* add updated information */
      retract(person(Id_female, Age_female, female, unmarried)),  /* ditto */
      assert(person(Id_female, Age_female, female, married)).

The predicate marry_all will attempt to marry everyone in the population according to the
defined criteria. This does not mean that everyone who is eligible for marriage will be
married at the end, because of demographic restrictions of the initial population. One
problem with this example, especially from an anthropological perspective, is that all males
and females are interchangeable with all other males and females respectively. There is no
mechanism to take account of kinship or other relationships, not even such primitive
aspects as sibling-hood! We can accommodate this by adding avoidance of half and full
siblings, extending the person structure to person(Id,Age,Sex,Marital,Father,Mother):

  :- dynamic person/6. /* person(Id,Age,Sex,Marital,Father,Mother) */

  marry(Id_male, Id_female) :- /* marry a couple */
    eligible(Id_male, Id_female),
    change_marital_status(Id_male, Id_female).

  eligible(Id_male, Id_female) :-
    is_male(Id_male),
    not(is_married(Id_male)),
    is_female(Id_female),
    not(is_married(Id_female)),
    not(are_siblings(Id_male, Id_female)),
    age_check(Id_male, Id_female).

  is_male(Id) :- person(Id,_,male,_,_,_).
  is_female(Id) :- person(Id,_,female,_,_,_).
  is_married(Id) :- person(Id,_,_,married,_,_).
  get_age(Id,Age) :- person(Id,Age,_,_,_,_).
  get_father(Id,Father) :- person(Id,_,_,_,Father,_).
  get_mother(Id,Mother) :- person(Id,_,_,_,_,Mother).
  are_siblings(Id1,Id2) :- get_father(Id1,Father), get_father(Id2,Father). /* same father */
  are_siblings(Id1,Id2) :- get_mother(Id1,Mother), get_mother(Id2,Mother). /* same mother */

  age_check(Id_male, Id_female) :-
    get_age(Id_male, Age_male),
    get_age(Id_female, Age_female),
    Age_male >= 18,
    Age_male - Age_female =< 7,
    Age_male - Age_female >= 5.

  change_marital_status(Id_male, Id_female) :-
    retract(person(Id_male,Age_male,male,unmarried,Mf,Mm)),
    assert(person(Id_male,Age_male,male,married,Mf,Mm)),
    retract(person(Id_female,Age_female,female,unmarried,Ff,Fm)),
    assert(person(Id_female,Age_female,female,married,Ff,Fm)).

From here we can elaborate the code further to include an absolute preference for FBD
(father’s brother’s daughter; eg if an FBD is available marry her, else marry someone else)
by replacing the marry predicate with the following predicates:

  marry(Id_male, Id_female) :- marry_fbd(Id_male, Id_female).

  marry(Id_male, Id_female) :- marry_other(Id_male, Id_female).

  marry_fbd(Id_male, Id_female) :- /* marry an eligible FBD */
    is_fbd(Id_male, Id_female),
    eligible(Id_male, Id_female),
    change_marital_status(Id_male, Id_female).

  marry_other(Id_male, Id_female) :- /* marry anyone else who is eligible */
    eligible(Id_male, Id_female),
    change_marital_status(Id_male, Id_female).

  is_fbd(Id_male, Id_female) :- /* the two fathers are siblings */
    get_father(Id_male, Mf),
    get_father(Id_female, Ff),
    are_siblings(Mf, Ff).

From this we can see that we can alter and elaborate the model to represent what we want.
For example, if we want to include the consideration of obligations, we first need a model
of obligation, then a representation of obligation, and finally a check at the time of the
marriage decision to see whether an obligation might affect marriage choice. Along the
same lines, we might want in the code above to add a deferral of the marriage decision until
some upper age boundary, say 25. If the male is not already married at 25 he must marry
someone, even if an FBD is not available. Or we could add a routine to see if there is the
prospect of an FBD becoming available, and if she has an obligation to marry him. In other
words, if we can model and represent some aspect of the situation, it can be included in the
simulation.

One obvious problem with the above example is that we have provided no
method of actually monitoring or otherwise getting information about what is happening.
Before the simulation is designed it is important to decide what information you are
seeking, and which questions it is to answer. As with other computing applications, the
simulation is a transformational method for relating Input to Output. A difference here is
that we are interested in the set of transactions that lead to this transformation.

Monitoring the simulation depends on what data is affected. In this case we have limited
data to monitor, since all that is changing is the marital status, and we are not recording
who the marriages are to. For example, we can count the married and unmarried people,
by gender and age group using the following:


  :- dynamic total/1. /* running count */

  count_people(_, _, _, _, _) :- /* delete any prior count and restart at zero */
    retractall(total(_)),
    assert(total(0)),
    fail.
  count_people(Gender, Low_age, Hi_age, Marital_status, _) :- /* count each matching person */
    person(_, Age, Gender, Marital_status, _, _),
    Age >= Low_age,
    Age =< Hi_age,
    retract(total(C)),
    C1 is C + 1,
    assert(total(C1)),
    fail.
  count_people(_, _, _, _, Count) :- /* report the result */
    total(Count).

The first definition of count_people deletes any prior count so that counting starts
afresh (total could be any name). It then fails so that the second definition will be tried.
The second definition matches the criteria you are counting by; eg
‘count_people(female,15,20,married,Count)’. It fails at the end so that the next person
will be examined. The third definition simply reports the result. This structure works
because Prolog always tries the next possible case if it fails. Since we record a case that
matches before we fail, we can keep a count. assert and retract are Prolog predicates
which respectively add and remove ‘facts’ from the database.

If we were to keep track of who was married to whom by extending the person structure,
then we could also count the number of FBDs who were married.
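
A minimal sketch of this extension follows, assuming SWI-Prolog. Rather than lengthening the person structure, it records each marriage as a separate married(Husband,Wife) fact. The names married, record_marriage and count_fbd_marriages are illustrative and not part of the program above, and the six people are invented for the example.

```prolog
/* Sketch only: married/2, record_marriage/2 and count_fbd_marriages/1
   are hypothetical names, and the sample population is invented. */
:- dynamic person/6, married/2.

/* person(Id, Age, Sex, Marital, Father, Mother) */
person(ahmed,  60, male,   married,   nazir,  bibi).
person(bashir, 58, male,   married,   nazir,  bibi).
person(abdul,  24, male,   unmarried, ahmed,  salma).
person(rubina, 18, female, unmarried, bashir, noor).
person(tariq,  25, male,   unmarried, yusuf,  hina).
person(zarina, 19, female, unmarried, karim,  amna).

get_father(Id, Father) :- person(Id, _, _, _, Father, _).
get_mother(Id, Mother) :- person(Id, _, _, _, _, Mother).

are_siblings(Id1, Id2) :- get_father(Id1, F), get_father(Id2, F).
are_siblings(Id1, Id2) :- get_mother(Id1, M), get_mother(Id2, M).

is_fbd(Id_male, Id_female) :-           /* the two fathers are siblings */
    get_father(Id_male, Mf),
    get_father(Id_female, Ff),
    are_siblings(Mf, Ff).

/* Change both statuses and remember who married whom. */
record_marriage(Husband, Wife) :-
    retract(person(Husband, Ah, male, unmarried, Hf, Hm)),
    assert(person(Husband, Ah, male, married, Hf, Hm)),
    retract(person(Wife, Aw, female, unmarried, Wf, Wm)),
    assert(person(Wife, Aw, female, married, Wf, Wm)),
    assert(married(Husband, Wife)).

/* Count the recorded marriages in which the wife is the husband's FBD.
   once/1 stops a couple being counted twice when the fathers share
   both parents. */
count_fbd_marriages(Count) :-
    findall(H-W, ( married(H, W), once(is_fbd(H, W)) ), Pairs),
    length(Pairs, Count).
```

With this database the query ‘record_marriage(abdul,rubina), record_marriage(tariq,zarina), count_fbd_marriages(N)’ gives N = 1: the fathers of abdul and rubina share a father, whereas nothing is known about the fathers of tariq and zarina.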

1.8 Building Blocks for Simulations

There are usually a number of different models at work in a given simulation, and indeed
this could be taken as a functional definition of simulation: solving a problem by the
interaction of at least two models. A simulation is then composed of a number of building
blocks, whose properties we more or less understand. These models are themselves
arranged in a larger interacting model, which represents the larger context of these models.
A simulation provides a proving ground for these sub-models. This complicates the
general usefulness of simulations, since the results that arise from a simulation are only as
good as all of the models that the simulation is built from. This can make the validation of
the simulation very problematic indeed.

Validation of a simulation centres on two different aspects: the sub-models and an
evaluation model. The sub-models must be independently validated to establish that they
behave as specified. The evaluation model is used to establish that the simulation does or
does not have specific validity.

The results of any simulation are, of course, only a descriptive model. Any explanatory
power that they might have must be argued from points that are outside the simulation
proper. In anthropology this is generally ethnographic data. As with a model, any
simulation is a simplification of the system under study, and in many cases does not even
represent any 'real' system at all. Rather, the simulation is intended to generate model data
for an 'ideal' world, with which we can then compare our data, noting where they
correspond to and depart from the ideal world. This is a useful technique, especially in the
early stages of analysis, since it can be used to establish a sense of how important specific
aspects of the context are to the analysis of the data.

The nature of the models used in the simulation depends very much on what the
objectives of the simulation are. If all we require is a model that behaves correctly with no
explanatory pretensions whatsoever, then our job is relatively easy. This is generally the
case for any sub-model whose behaviour is independent, or can be modelled as
independent, of the rest of the simulated context. This includes structures such as rain and
weather in general, the passage of time, and the presence or absence of game, fish, honey,
or other ‘natural’ resources. It can also include other sub-models, if they are not the
principal object of study. Thus we can include crop yields in this category if we are not
interested in the micro-mechanics of corn growth. What we care about in the simulation is
that given specific environmental conditions, specific inputs, and human labour we will
expect a corn yield. In many simulations all of the sub-models can be descriptive
behaviour generators, if we are principally interested in their interaction.

1.9 Generating Behaviour: Independent Events

If we simply need to generate events, such as rainfall, which are independent of other
elements in the simulation then a statistical/probabilistic model is often best, especially if
the simulation is one which will be run many times to establish its overall behaviour. This
is usually the case because most social situations involve a great deal of uncertainty, and
often the only sensible method of investigating them is to look at a range of solutions. If
the event we want to model has a numeric value, such as rainfall, and we have several
years of recorded data for rainfall, we can often simply use the mean and standard
deviation of the rainfall to generate a value. If we only require a few categories, we can
break the probabilities into a table, and select from that. Different distributions suit
different kinds of data. For example, disease events are often better sampled from a
Poisson distribution than a normal or binomial distribution.
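
The three kinds of generator mentioned here can be sketched as follows. This assumes SWI-Prolog, whose random/1 returns a uniform value in the open interval (0,1). The mean and standard deviation, the category table and all predicate names are invented for illustration. The Box-Muller transform and Knuth's multiplication method are standard textbook techniques that we have chosen for the sketch.

```prolog
/* Sketch only: all numbers below are invented, not recorded data. */
:- use_module(library(random)).

/* normal_sample(+Mean, +SD, -X): Box-Muller transform of two uniform draws */
normal_sample(Mean, SD, X) :-
    random(U1),
    random(U2),
    Z is sqrt(-2 * log(U1)) * cos(2 * pi * U2),
    X is Mean + SD * Z.

/* annual_rainfall(-Mm): rainfall cannot be negative, so truncate at zero */
annual_rainfall(Mm) :-
    normal_sample(600, 150, X),
    Mm is max(0.0, X).

/* Category table: Category-Probability pairs whose probabilities sum to 1 */
rain_table([drought-0.2, normal-0.6, flood-0.2]).

/* pick(+R, +Table, -Category): walk the table, subtracting each
   probability from R until R falls inside an entry */
pick(_, [Category-_], Category) :- !.      /* last entry absorbs rounding error */
pick(R, [Category-P|_], Category) :- R < P, !.
pick(R, [_-P|Rest], Category) :-
    R1 is R - P,
    pick(R1, Rest, Category).

rain_category(Category) :-
    rain_table(Table),
    random(R),
    pick(R, Table, Category).

/* poisson_sample(+Lambda, -K): multiply uniform draws until the product
   drops to exp(-Lambda); K is the number of draws before that happens */
poisson_sample(Lambda, K) :-
    Limit is exp(-Lambda),
    poisson_loop(Limit, 1.0, 0, K).

poisson_loop(Limit, P, Count, K) :-
    random(U),
    P1 is P * U,
    (   P1 =< Limit
    ->  K = Count
    ;   Count1 is Count + 1,
        poisson_loop(Limit, P1, Count1, K)
    ).
```

For example, ‘annual_rainfall(Mm)’ yields a different value on each call, while ‘rain_category(C)’ returns normal about six times in ten. A disease sub-model might call ‘poisson_sample(2.0,Cases)’ once for each simulated year.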

For example, we can elaborate the simulation in §1.7 by operating it on an annual cycle.
To do this we need to do three things: we must age everyone one year, we must have
some mortality, and we must have some births. Aging is easy:

  age_people :-
    person(Name, Age, Gender, Marital_status, Father, Mother),
    retract(person(Name, Age, Gender, Marital_status, Father, Mother)),
    New_age is Age + 1,
    asserta(person(Name, New_age, Gender, Marital_status, Father, Mother)),
    fail.
  age_people.

The others can be simulated in simple cases by applying probabilities of births to women
of child-bearing age, with appropriate weights for married and unmarried women, and
applying mortality to each person based on age.
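
A minimal sketch of these two steps, assuming SWI-Prolog, might look like this. Every probability is invented for illustration and is not a demographic estimate. The child-bearing ages of 15 to 45, the helper predicates chance, new_child and next_id, and the placeholder unknown for unrecorded parents are likewise assumptions of the sketch.

```prolog
/* Sketch only: probabilities, age limits and names are hypothetical. */
:- use_module(library(random)).
:- dynamic person/6, next_id/1.

/* unknown is a placeholder parent; a sibling check would need to ignore it */
person(abdul,  24, male,   married,   unknown, unknown).
person(rubina, 18, female, married,   unknown, unknown).
person(zarina, 22, female, unmarried, unknown, unknown).

next_id(1).                              /* counter used to name new children */

/* chance(+P): succeed with probability P */
chance(P) :- random(R), R < P.

/* death_probability(+Age, -P): annual risk of death by age band */
death_probability(Age, 0.05) :- Age < 5, !.
death_probability(Age, 0.01) :- Age < 50, !.
death_probability(_,   0.10).

/* birth_probability(+Marital, -P): annual chance of a birth, weighted
   by marital status */
birth_probability(married,   0.30).
birth_probability(unmarried, 0.02).

/* Remove each person who fails this year's mortality draw. */
apply_mortality :-
    person(Id, Age, Sex, Marital, Father, Mother),
    death_probability(Age, P),
    chance(P),
    retract(person(Id, Age, Sex, Marital, Father, Mother)),
    fail.
apply_mortality.

/* Give each woman of child-bearing age one birth draw. */
apply_births :-
    person(Mother, Age, female, Marital, _, _),
    Age >= 15,
    Age =< 45,
    birth_probability(Marital, P),
    chance(P),
    new_child(Mother),
    fail.
apply_births.

/* The father is recorded as unknown because marriages are not tracked here. */
new_child(Mother) :-
    retract(next_id(N)),
    N1 is N + 1,
    assert(next_id(N1)),
    atom_concat(child, N, Id),
    ( chance(0.5) -> Sex = male ; Sex = female ),
    assert(person(Id, 0, Sex, unmarried, unknown, Mother)).
```

One simulated year would then call the aging, mortality and birth steps in turn, followed by the marriage step of §1.7.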

1.10 Evaluating results

The results of a simulation can be evaluated in a number of ways. If the simulation is
basically an empirical one, which has a number of random or statistically generated events
within it, we can often evaluate the results by using a statistical test such as chi-square.
Many times, though, we are principally interested in using the data to establish some point
for which we don't have direct data. Here we are exploring the structural possibilities
given what we do know. Simulations are useful for 'what if' situations, where we are
attempting to extrapolate from what we do know to areas for which we have little or no
data. One method of some use in evaluating a simulation is examining its structural
stability. This is useful where we are (sometimes grossly) estimating a number of values
for the different models because the information is simply not available. This is common in
ethnography, because of necessity we collect an account of events that are idiosyncratic to
the time during which we are in residence, in most studies less than two years, and often
less than a year. We have, however, some confidence that the behaviour that we observe
during our tenure in the situation is not simply idiosyncratic behaviour. The particular
events and situations we observe are, but we assume that the responses to these are derived
from general principles of the society, and this is usually the object of an ethnographic
analysis. Simulation can give us an opportunity to validate some of these assumptions and
analyses, because with simulation we can create contexts and situations that did not occur
during our study. If the various 'solutions' we find in the social group are likely to
represent general processes, we expect that they will work, and the social group will
survive, in a wide range of possibilities. For example, agricultural practices that lead to the
loss of one crop in three are not likely to be considered successful, and probably represent
at least a situation where we did not collect enough data. While we can't be sure of our
simulation model, we can establish the various limits to which the simulation can adapt,
that is, the various points at which it breaks down.
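
Locating such a breakdown point can be illustrated with a deliberately crude sketch, assuming SWI-Prolog. A household needs a fixed amount of grain each year, stores any surplus, and loses its crop in every Nth year. The model, its numbers and the predicate names are all invented for the illustration.

```prolog
/* Sketch only: a toy deterministic model with invented quantities. */

/* harvest(+Year, +FailEvery, +Yield, -Crop): the crop fails in every
   FailEvery-th year and yields Yield otherwise */
harvest(Year, FailEvery, _, 0) :- Year mod FailEvery =:= 0, !.
harvest(_, _, Yield, Yield).

/* survives(+FailEvery, +Yield, +Need, +Years): the grain store never
   runs out during the simulated period */
survives(FailEvery, Yield, Need, Years) :-
    run(1, Years, FailEvery, Yield, Need, 0).

run(Year, Years, _, _, _, _) :- Year > Years, !.
run(Year, Years, FailEvery, Yield, Need, Store) :-
    harvest(Year, FailEvery, Yield, Crop),
    Store1 is Store + Crop - Need,
    Store1 >= 0,                         /* breakdown: the store is exhausted */
    Year1 is Year + 1,
    run(Year1, Years, FailEvery, Yield, Need, Store1).

/* breakdown(+Yield, +Need, +Years, -FailEvery): the most frequent crop
   failure (smallest interval between failures) that can be absorbed */
breakdown(Yield, Need, Years, FailEvery) :-
    between(1, Years, FailEvery),
    survives(FailEvery, Yield, Need, Years),
    !.
```

With a need of 10 units a year over 30 years, ‘breakdown(15,10,30,N)’ gives N = 3: a yield of 15 can just absorb the loss of one crop in three. A yield of 14 cannot, and ‘breakdown(14,10,30,N)’ gives N = 4. Sweeping an estimated value in this way shows how sensitive a conclusion is to that estimate.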

 
2 Expert Systems and Anthropological Analysis

The idea of using an expert system, a computer program that simulates a human expert
(i.e. an informant), in anthropological analysis has been received by anthropologists with
some interest, but with more caution (Davis 1984:3). This caution is justified because to
most anthropologists the inner workings of the expert system are not known; they are
black boxes.  But anthropologists should be interested in a model that claims to represent
and use human knowledge, if only to evaluate that model.  This section describes some of
the basic assumptions in contemporary expert systems, discusses their usefulness to
anthropology, and concludes that many existing expert systems are of limited interest to
anthropologists, although the general model underlying expert systems can be used
productively.

2.1 Introduction

Artificial Intelligence (AI) is a multi-disciplinary area in which the goal is to represent
intelligence (usually human intelligence) in the modelling environment of a computer.
There has been research in Artificial Intelligence since there have been computers. It
was believed in the ‘fifties that ‘just a few more years’ would bring about a revolution in
AI, but those few years have receded annually 20. In the past decade there have been
developments in AI that are considered by AI researchers (and others) to be partial
successes; among these is the expert system. Expert systems are computer-based models
that simulate human expertise in a specific area (domain), such as a subset of medicine
(Shortliffe 1976), exploratory geology (Duda 1978), or education (Clancey 1981). Expert
systems are claimed by AI researchers to be an important advance, and some claim
implications about models of human representation of knowledge, and mechanisms of
inference (Barr 1982).

There is a small but growing literature on the use of expert systems in anthropology.
Besides Kippen (1988), described in §1.1, Brent (1988) has developed an expert system
to assist in statistical analysis. Furbee (1989) describes an expert system for ‘folk’
classification of soil in the Colca Valley in Peru. Read and Behrens (1992) describe a
simple expert system they developed in 1987 in which they modelled decision making
about terms of address used by Bisayan speakers in the Philippine Islands, adapted from
Geoghegan (1971). Fischer and Finkelstein (1991) wrote a production system which
simulated evaluating a potential marriage partner in an arranged marriage in the Panjab,
Pakistan. Benfer et al (1991) is a good anthropological introduction to expert systems.

2.2 Qualitative and Quantitative Analysis

Qualitative analysis can be defined as identifying qualitative structures, identifying the
states of those qualitative structures, and the pattern of changes (transformations) in those
states 21.  Quantitative methods can sometimes be used to aid this process, but usually
qualitative methods are exclusively used for the analysis of qualitative data and structures
for which quantities proper are difficult to define.

Thom  (1975) argues that all quantitative analysis assumes a firm qualitative foundation.
Before they measure, people must agree that there is something to be measured, and that is
a qualitative judgement. Similarly, people must agree that the measure (metric) they use is
appropriate, and applicable to other phenomena 22.

As an example, consider per capita income.  It is apparently easy enough to agree on the
structure, but the metric is another issue. If currency is used as a metric, a poor family in
the United States would be a wealthy one in Pakistan.  The metric can be further adjusted
by considering cost of living, but an acceptable level of living in the United States is not
equivalent to one in Pakistan.  The problem is not difficult to understand qualitatively;
there are different standards in these two places. The two countries’ per capita income can
be compared quantitatively, but the interpretation of the comparison is qualitative.  The
quantitative analysis is more difficult to reconcile, and indeed is undecidable without
reference to qualitative structures in the two societies.

In most cases quantitative analysis depends on continuity. To quantify a phenomenon
meaningfully it is usually necessary to assume that the relation between phenomenon and
metric can be described by a continuous function 23, since a primary goal of
quantification is to provide a basis for comparison. For phenomena where the analytic
focus is on states this is often misleading or impossible. In most social phenomena there
is no continuous function that can adequately describe the important qualitative
relationships. As an example consider income and education. These are variables which
are often given a quantitative definition in social research. They are relatively easy to
define, and people generally measure income in currency, education in years. But
linearity is often assumed, and usually there will be a good correlation between
them. It will not be a perfect correlation, as one unit change in the independent
variable will not result in some regular linear unit change in the dependent variable. Now
this is not terribly shocking, since people do not expect that all the variation in one
variable is to be explained by the other, but there is benefit in understanding the
relationship between the variables by breaking the relationship into stages, and examining
the conditions for moving from one stage to the next. For instance, in the U.S.A. 11
years of education is minimally better than 10 years, but 12 years is far better than 11. This
is due to the local structure of American education; 11 years is pre-graduation, and 12
years is post-graduation. The graduating student has a qualitatively changed educational
status; the pre-graduating student's status has not significantly changed. This type of
analysis helps to give a better account of interactions.

Another reason quantitative analysis must depend on qualitative analysis is illustrated in
Figure 3. The graph shows hypothetical data g and two solutions fitted to that data.
Solution 1 is the better qualitative fit, as the relative shape appears to be the same as the
data, but is not as good a fit quantitatively as Solution 2. Solution 2 fits well quantitatively,
but probably describes a different underlying mechanism altogether.


Figure 3. Two models of g (adapted from Thom 1975). [image: simulation_5.gif]

2.3 Expert Systems

An Expert System is designed to simulate one aspect of a human expert: the ability to
classify phenomena from a set of attributes. The expert system is a classification engine.
It is a system that takes information about a particular case or instance within the domain
of the system and produces a qualitative result (or goal state). It usually has incomplete
information, and makes qualitative judgements based on this information. Expert systems
are defined in terms of algorithms in a computer program plus relationships established by
a human expert. This will be interesting to anthropologists if three conditions are met: the
computer should arrive at the same conclusions as a native expert; it should arrive at the
same conclusions as an anthropologist; and it should do useful jobs.

Expert systems, as a class of computer programs, are currently designed to reflect a
general model current within the artificial intelligence community; an expert system is not
simply a simulation of human expertise, but must be implemented (on a computer) in a
particular fashion; it is a product of an AI culture. Ideally an expert system has two
primary components (see Figure 4):

The Knowledge Base.  

The Knowledge Base is essentially a set of rules describing relations between elements in
the domain of knowledge. In the simplest form:

  [condition(s) → outcome(s)]

In spite of this notation causality is not assumed. The rules for deriving an outcome from
a set of conditions are always formulated externally by an expert usually aided by a
knowledge engineer,  i.e. a specialist in transforming the expert’s information into
statements suitable for a knowledge base.  The knowledge engineer stands to the expert as
anthropologists do to their informants. The knowledge that is selected for inclusion in the
knowledge base can have a variety of forms, depending on the form of the inference
engine.

Most expert system designers consider it important that the rules be easily inserted,
modified, or deleted from the knowledge base, in any order.  They usually consider the
rules to be weakly connected: there is no sequencing information about the order in which
they can apply, and the only connections between them are the use of common terms of
reference. Thus if one rule determines that a person’s residence is patrilocal, and another
rule can use that residence information to draw further conclusions, the rules are
connected.

The Inference Engine.

An Inference Engine is a method of using the rules in the knowledge base to derive a
conclusion. Using the simple knowledge representation above this might take the form:


if condition then add outcome to the context

Where outcome is the conclusion if condition is true, and context is an area where
knowledge is recorded to determine if conditions are true.  An outcome is often part of
another condition that matches another rule. In other words, the inference engine takes the
rules provided by the knowledge base, and uses internal rules of inference to draw a
conclusion. The claim is that the internal rules are general to all inference. So the inference
engine is a set of rules which are applied to the rules in the knowledge base.
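
The scheme ‘if condition then add outcome to the context’ can be sketched in Prolog as follows. This is an illustration under our own assumptions, not the engine of any particular expert system: rule/2 holds the knowledge base, fact/1 holds the context, and the two residence rules are invented placeholders.

```prolog
/* Sketch only: rule/2, fact/1 and the example rules are hypothetical. */
:- dynamic fact/1.

/* Knowledge base: rule(Conditions, Outcome). The rules may be listed
   in any order; they are connected only through shared terms. */
rule([residence(patrilocal), has_children],
     children_raised_in(fathers_household)).
rule([couple_lives_with_husbands_father],
     residence(patrilocal)).

/* Inference engine: find a rule whose conditions are all in the context
   and whose outcome is not yet there, add the outcome, and repeat
   until no rule adds anything new. */
forward_chain :-
    rule(Conditions, Outcome),
    \+ fact(Outcome),
    all_true(Conditions),
    assert(fact(Outcome)),
    !,
    forward_chain.
forward_chain.

/* all_true(+Conditions): every condition is recorded in the context */
all_true([]).
all_true([C|Cs]) :- fact(C), all_true(Cs).
```

After ‘assert(fact(couple_lives_with_husbands_father)), assert(fact(has_children)), forward_chain’ the context also contains residence(patrilocal) and children_raised_in(fathers_household). The second conclusion depends on the first, even though its rule is listed earlier. Note that the engine itself contains no knowledge of residence or kinship.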

The inference mechanism is thus critical to the outcome; it is responsible for any
interrelation of elements beyond the rules in the knowledge base. It is usually based upon
some variant of logic, such as first-order logic, fuzzy logic (Zadeh 1975), modal logic
(Zeman 1973), or intuitionistic logic (Martin-Löf 1982), and also usually employs some
statistical mechanisms for measurement and classification.  

The inference engine is intended to be based on a general model for using knowledge and
should not have special knowledge about a particular domain. This model is claimed to be
unlike the usual computer program/model structure, because the specifics are separated
from the methods.  This distinction is made for at least two reasons:

a) It makes possible system expertise in different domains by modifying the knowledge
   base without modifying the inference engine.
b) AI researchers assume that in humans knowledge and inference are separate activities
   and that inference is prior to knowledge. Hence it is theoretically consistent to separate
   the two in the computer model.
(Derived from Barr 1982)

Figure 4. Expert System Schematic [image: simulation_6.gif]

In most existing expert systems the knowledge base and inference engine are not terribly
complex in design.  The knowledge base determines the set of possible outcomes that the
system can consider and the rules for arriving at those outcomes. Although this requires
great effort on the part of the human expert and the knowledge engineer, the form of
representation is quite simple.

In many systems both outcomes and rules have an objective or subjective probability
associated with them, again derived from the human expert.  The knowledge base consists
of high-level structures derived through the formidable pattern matching and inference
skills of humans.

An inference engine has three parts: an identification mechanism, an evaluation
mechanism, and a goal mechanism. The first two constitute the inference mechanism
proper, and the third is for finding efficient paths to an outcome; it does not strictly affect
the outcome (unless it is poorly designed), but it selects the best condition to request data
on rather than requesting all possible conditions in the knowledge base. So the goal
mechanism is a search pattern through the possible conditions that apply to a case, and it is
the goal mechanism that gives the expert system the appearance of performing like a
human expert by requesting a minimum of information.  The inference mechanism gives
the expert system the judgement to announce a result consistent with the knowledge base.
Most of the successful (externally validated) expert systems use some form of probabilistic
model (often Bayesian) as the basis of the inference mechanism, using the probabilities
associated with the knowledge base.  One common goal mechanism works by finding the
goal that is most likely to be true at the current time, and then finding the condition that will
give the most information about that goal (as defined by the evaluation mechanism).
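
As a minimal sketch of such a probabilistic mechanism, assuming SWI-Prolog, the fragment below applies Bayes' rule to one goal, chooses the most likely goal, and selects the next condition to ask about. Measuring ‘most information’ by how far the likelihood ratio lies from 1 is our own simplification, and the evidence figures and predicate names are invented.

```prolog
/* Sketch only: probabilities and names are hypothetical. */
:- use_module(library(lists)).

/* posterior(+Prior, +PEgivenH, +PEgivenNotH, -Post): Bayes' rule for one
   goal H after one condition E has been found to hold */
posterior(Prior, PEgivenH, PEgivenNotH, Post) :-
    Post is (Prior * PEgivenH) /
            (Prior * PEgivenH + (1 - Prior) * PEgivenNotH).

/* most_likely(+Goals, -Best): Goals is a list of Goal-Probability pairs */
most_likely([G-P|Rest], Best) :- most_likely(Rest, G-P, Best).

most_likely([], Best-_, Best).
most_likely([G-P|Rest], _-P0, Best) :- P > P0, !, most_likely(Rest, G-P, Best).
most_likely([_|Rest], Current, Best) :- most_likely(Rest, Current, Best).

/* evidence(Goal, Condition, P(Condition|Goal), P(Condition|not Goal)) */
evidence(suitable, zat_same,    0.9, 0.3).
evidence(suitable, distance_ok, 0.6, 0.5).

/* next_question(+Goal, -Condition): the condition whose likelihood
   ratio is furthest from 1, and so shifts the probability most */
next_question(Goal, Condition) :-
    findall(Score-C,
            ( evidence(Goal, C, PH, PN), Score is abs(log(PH / PN)) ),
            Scored),
    max_member(_-Condition, Scored).
```

For example, ‘posterior(0.5,0.8,0.2,P)’ raises an even prior to about 0.8, and ‘next_question(suitable,C)’ selects zat_same, because knowing it changes the probability of the goal far more than distance_ok does.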

2.4 A simple example

Consider the factors that influence the marriages arranged by urban Punjabis of Lahore
(Fischer 1991a; Fischer 1991b). Marriages are arranged in the Punjab by the parents and
other relatives of the potential groom or bride. The following factors (not necessarily in
this order) appear to be the most important in the evaluation of a possible spouse:

  1 zat   (sometimes glossed as caste)
  2 jihez   (dowry)
  3 intellect
  4 education
  5 haq  mehr  (bride deposit)
  6 beauty
  7 izzat   (honour, respect, responsibility)
  8 baradarie   (clan)
  9 rishtidar   (relative)
  10 distance   (from natal home)

These are not Panjabi selection criteria, but an anthropologist’s measurement or probe of
the semantic domain of selection derived from what Panjabis say.  In addition, the
selection is influenced by the size of social networks and the availability especially of
females, who are supposed to be invisible before (and after) marriage except to relatives.

The relationships between these measurements are quite complex, and they are evaluated
relatively. For instance, if the zat of two candidates is different, then what constitutes
enough izzat will be different in each case. In other words the state of enough izzat varies,
depending on at least one other value. Amounts measured are not evaluable without other
context; there is a high degree of relativity.  Moreover it is probable that different people
have different selection models, and one person may have more than one.

To construct an expert system based on this situation:

1 What will the expert system do? Give a statement of the suitability of possible marriage
  partners.

2 How will the expert system do it? This is a fixed solution (relative to a particular
  inference mechanism), since an expert system uses the same inference method
  regardless of the knowledge domain. Initially assume a simple mechanism: internal
  rules derived from examples of previously considered marriages given a suitability
  judgement by local experts. These rules can be derived using a statistical mechanism
  which weights the effect on each marriage of each of the factors. In essence the rules
  treat each factor as a dimension in a multidimensional space, and locate each qualitative
  state (suitable/not suitable) within that space, given a value for each axis. When the
  expert system is consulted, the evaluation mechanism will test to see if the input factors
  required by the inference engine are within a statistically significant distance from the
  internal rule-derived values. The goal mechanism will find the factor that makes the
  biggest difference in continuing evaluation, and ask for that information.

  This is known as forward chaining because it works from factors to outcomes. Many
  current expert systems turn the above goal mechanism on its head or side, called
  backwards chaining and sideways chaining respectively. Backwards chaining is
  favoured for systems that have a large number of outcomes, much like the above
  example if all the individuals in the marriage universe are included as part of the
  knowledge base. In this type of system, the expert system would start by attaching
  probabilities to each person in the base, and finding information that would remove a
  person from consideration. This is called backwards chaining because it works from
  solutions to factors, and appears more purposeful (Nilsson 1982). In this case a person
  is the outcome rather than a simple yes or no. Sideways chaining works a bit on both
  principles, finding both weighted factors and weighted solutions.

3 What kind of data will it require? The data is dependent on the kind of inference
  mechanism used. In this simple case the data will be of the form:

  marriage {value of factors 1-10}

  where the value will have already been weighted by the human operator, in terms of too
  little, too much or enough, or, where appropriate, yes, no, same or different. The
  weighting in this example is assumed to always be from the son-giving side. This
  would give us a knowledge base like that in Figure 5:

  Factor        Marriage 1  Marriage 2  Marriage 3
  zat           same        different   same
  jihez         enough      too low     too high
  intelligence  enough      too high    too low
  relative      yes         no          yes
  education     too low     enough      too high
  haq           too low     too high    enough
  beauty        enough      enough      too little
  izzat         enough      too high    too low
  baradarie     same        different   same
  location      too far     ok          ok
  suitability   yes         no          yes

Figure 5.  Example measurements for marriage model.

In consultation the expert system takes in the knowledge base, creates internal rules, and
answers the request, which would be for the suitability of a possible marriage. (fig 3) To
derive an answer it asks the user to give values for some of the factors until it is possible to
determine the qualitative result, and then makes a pronouncement, yes or no. Note that it
cannot ask the suitability question itself, as answering it is the purpose of the system, but it
requires this value in the knowledge base input to form the rules.
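
To make the consultation concrete, the three marriages of Figure 5 can be stored as facts and a new case compared with them. The sketch below, assuming SWI-Prolog, does not implement the statistical weighting described above. It swaps in a much simpler nearest-case comparison that counts the factors on which two cases differ, and it expects all ten values at once rather than asking for them one by one. The predicate names are illustrative.

```prolog
/* Sketch only: a nearest-case comparison stands in for the statistical
   mechanism of the text; case/3 and the other names are hypothetical. */

/* case(Id, Values, Suitability), with values in the order of Figure 5:
   zat, jihez, intelligence, relative, education, haq, beauty, izzat,
   baradarie, location */
case(marriage1,
     [same, enough, enough, yes, too_low, too_low, enough, enough,
      same, too_far],
     yes).
case(marriage2,
     [different, too_low, too_high, no, enough, too_high, enough, too_high,
      different, ok],
     no).
case(marriage3,
     [same, too_high, too_low, yes, too_high, enough, too_little, too_low,
      same, ok],
     yes).

/* mismatches(+Xs, +Ys, -N): number of positions at which the values differ */
mismatches([], [], 0).
mismatches([X|Xs], [Y|Ys], N) :-
    mismatches(Xs, Ys, N0),
    ( X == Y -> N = N0 ; N is N0 + 1 ).

/* suitability(+Values, -Answer): the judgement attached to the stored
   case that differs from Values on the fewest factors; keysort/2 is
   stable, so ties go to the earliest case */
suitability(Values, Answer) :-
    findall(N-A,
            ( case(_, Stored, A), mismatches(Values, Stored, N) ),
            Scored),
    keysort(Scored, [_-Answer|_]).
```

A candidate described by [different, enough, enough, no, enough, too_high, enough, too_high, different, ok] differs from Marriage 2 on only two factors, so ‘suitability’ answers no. With only three stored cases the answer is of course no better than the examples supplied.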

2.5 A knowledge-based example

The example in §2.4 is fairly easy to follow, but has little depth because of the immense
amount of analysis that is needed to set it up; for it to work it must be told to seek the
correct information (the selected criteria), and that is known only after analysis. It also
fails to take into account any higher-level ethnographic or ethnological knowledge; it is a
purely descriptive model with no explanatory power. Additionally, the particular method
described is heavily committed to a particular model in the formulation of rules, and
assumes that the results are linearly differentiable; that is, that each state has a unique
coordinate range in the multi-dimensional space.

Most expert systems incorporate higher-level knowledge, in the form of explicit rules in
the knowledge base.  The previous example can be greatly improved in performance by
adding rules of the following type to the knowledge base:

1) if zat is same then izzat is enough.
2) if baradarie is same then izzat is enough.
3) if relative is yes then zat is same.
4) if relative is yes then baradarie is same.
5) if distance is too far and relative is yes then distance is ok.

and so on.  These kinds of rules add information about factors that cannot be taken into
account in a regular, statistical method.  One might ask why the entire system could not be
based from rules like these, freeing the system of dealing with deriving rules from
empirical data altogether.  The answer is that one can, and most working systems do.





  Simulation 24

However, although the rules appear to be ‘higher-level’, they are no less empirical with
respect to the expert system, and provide no explanation for the outcome that is not in the
rules to begin with.  This defect is usually overcome in expert systems by having the expert
and the knowledge engineer add comments to each rule, so that when a user asks why a
particular conclusion has been reached, the comments are displayed for each rule used in
the successful derivation of that conclusion.
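
A minimal sketch of this commenting device follows, assuming each rule carries a free-text comment and that the system records which rule established each conclusion. The comment strings are placeholders, and the names derive and explain are hypothetical; only rules 1 and 3 from the list above are used.

```python
# (rule number, conditions, conclusion, comment); comments are placeholders only.
RULES = [
    (1, {"zat": "same"}, ("izzat", "enough"), "expert's comment on rule 1 (placeholder)"),
    (3, {"relative": "yes"}, ("zat", "same"), "expert's comment on rule 3 (placeholder)"),
]


def derive(facts):
    """Forward-chain over RULES, remembering which rule produced each conclusion."""
    facts = dict(facts)
    support = {}  # (attribute, value) -> the rule that established it
    changed = True
    while changed:
        changed = False
        for rule in RULES:
            _, conditions, conclusion, _ = rule
            attr, value = conclusion
            holds = all(facts.get(a) == v for a, v in conditions.items())
            if holds and facts.get(attr) != value:
                facts[attr] = value
                support[conclusion] = rule
                changed = True
    return facts, support


def explain(conclusion, support):
    """Collect the comments of every rule in the derivation of `conclusion`."""
    rule = support.get(conclusion)
    if rule is None:
        return []  # supplied as input rather than derived
    number, conditions, _, comment = rule
    lines = []
    for condition in conditions.items():
        lines.extend(explain(condition, support))
    lines.append(f"rule {number}: {comment}")
    return lines


facts, support = derive({"relative": "yes"})
explanation = explain(("izzat", "enough"), support)
print("\n".join(explanation))
```

The explanation lists the comment for rule 3 and then the comment for rule 1, which is the order in which the rules contributed to the conclusion. It contains nothing that was not already written into the rules.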

We need not limit ourselves to such simple kinds of factors. For example, consider some
rules adapted and simplified from Fischer and Finkelstein (1991):

if girl is immoral then marriage is not a good risk
if mother is immoral then daughters are probably immoral
if girl is immoral then younger sisters may be immoral
if girl plays suggestive music then girl is immoral

believed: ‘girl played suggestive music’
conclusion: ‘marriage may not be a good risk’
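
The derivation can be sketched as a goal-driven search that starts from the conclusion of interest and works back to what is believed. This is an illustration under stated assumptions, not the mechanism of Fischer and Finkelstein (1991). Statements are treated as plain strings; the hedges ‘probably’ and ‘may’ are dropped, so the sketch does not reproduce the ‘may’ in the conclusion above. The belief is phrased to match the rule, and the link between ‘daughters’ and a particular girl is not modelled. The name prove is hypothetical.

```python
# (premise, conclusion) pairs simplified from the four rules above.
RULES = [
    ("girl is immoral", "marriage is not a good risk"),
    ("mother is immoral", "daughters are immoral"),
    ("girl is immoral", "younger sisters are immoral"),
    ("girl plays suggestive music", "girl is immoral"),
]


def prove(goal, beliefs, seen=frozenset()):
    """Return the chain of statements leading to `goal`, or None if there is none."""
    if goal in beliefs:
        return [goal]
    if goal in seen:  # guard against circular rule sets
        return None
    for premise, conclusion in RULES:
        if conclusion == goal:
            chain = prove(premise, beliefs, seen | {goal})
            if chain is not None:
                return chain + [goal]
    return None


chain = prove("marriage is not a good risk", {"girl plays suggestive music"})
print(" -> ".join(chain))
```

The search returns the chain from the believed statement, through ‘girl is immoral’, to the goal. Only the rules relevant to the goal are examined; the rules about the mother and the younger sisters are never used.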

Even in this simplified model it is clear that much of the complexity of the computing
component is in the goal mechanism, which ideally has no analytic effect on the final
outcome, whereas the inference mechanism is relatively simple, using models that are
more or less in common usage in descriptive analysis.  In spite of this, many expert
systems do succeed in making judgements consistent with those of the human experts they
are based on (Michie 1982).  They achieve this by representing knowledge as a set of local
models, each made up of one or more rules, that are only weakly (and informally) interrelated,
rather than by having a single large formal model of the expert’s knowledge.

Of course the degree of interrelation varies from system to system. For example, in most
learning systems the initial set of structures that the system is told to learn about has been
carefully selected so that the structures are statistically independent of each other.  In
systems based on input rules, the rules will have been carefully selected.  Most successful
systems have undergone an enormous amount of tuning and pruning to achieve their results,
using rules similar to those in the latter example.  But the point remains that the knowledge
base consists of a large number of conditions and outcomes that are not generally arranged
in a deterministic structure by the human expert; rather, they represent bits of information
that are connected by the sense of relevance that the human expert gives them.  It is the
inference engine's role to reconstruct this relevance.  Both the former and the latter style of
knowledge base share the same assumption: that each outcome has some set of derivations
that does not intersect those of other outcomes.

Most current expert systems also have a probabilistic component.  The knowledge base is
for the most part entered in the form of ‘higher-level’ rules, but objective and subjective
probabilities are attached to the conditions and outcomes by the human expert. This is one
way to allow the derivation of the outcome to be partial; the outcome need not be
absolutely defined with respect to the knowledge base, only defined to some arbitrary
degree of probability.  This greatly amplifies the capacity of the expert system to classify,
since it is not restricted to finding exact matches with what has been encountered before, but
can instead treat earlier cases as prototypes for comparison, simulating the capacity of human
experts to make judgements on new cases.
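
One way a partial derivation might work is sketched below. It assumes a MYCIN-style certainty-factor scheme, which is a swapped-in technique chosen for illustration and not necessarily the one used by any system discussed here. Every number is invented, and the names combine and evaluate are hypothetical.

```python
# (conditions, conclusion, certainty attached to the rule by the expert).
# All numbers are invented for illustration.
RULES = [
    (["zat is same"], "izzat is enough", 0.8),
    (["bradarie is same"], "izzat is enough", 0.6),
]


def combine(cf_a, cf_b):
    """MYCIN-style combination of two positive certainty factors."""
    return cf_a + cf_b * (1 - cf_a)


def evaluate(evidence, rules=RULES):
    """Single pass over the rules; `evidence` maps statements to certainties in [0, 1]."""
    derived = {}
    for conditions, conclusion, rule_cf in rules:
        # A rule is taken to be only as certain as its weakest condition.
        support = min(evidence.get(c, 0.0) for c in conditions) * rule_cf
        derived[conclusion] = combine(derived.get(conclusion, 0.0), support)
    return derived


result = evaluate({"zat is same": 1.0, "bradarie is same": 0.5})
print(round(result["izzat is enough"], 2))
```

With zat known to be the same and bradarie only half certain, the outcome is supported to a degree of about 0.86 rather than being simply true or false. Two weak lines of evidence reinforce each other without either having to match exactly.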

There are several ways to account for the success of current expert systems.  First, since
the local models as presented to the expert system are only descriptive models, and the
overall system is a performance model, no internal explanation need be generated; the
expert system is judged only on its descriptive performance. Second, modern statistical
methods are quite powerful descriptively, so one could expect them to be reliable
descriptors when used.  Third, the knowledge base is created, selected and pruned by
humans and consists of human expert judgements. This is also true of information
supplied to the expert system while it is operating: it is assumed that the human can
answer the questions asked by the expert system appropriately and correctly.  In many
ways, then, the success of contemporary expert systems is a sleight of hand; all the human
interaction in the process is taken for granted. But it is fair to say that all the expert system
designer claims to do is represent the knowledge of a human expert, not to create a
human-like expert.

From an anthropologist’s point of view the rule-based model is preferable to the statistical
one, but the choice makes no difference to the goal of the expert system, which is simply to
mimic an expert descriptively.  No current system can do more; expert system writers
might claim psychological reality (many do not), but that is a far cry from establishing
psychological reality, as the debates (see Buchler and Selby 1968; Burling 1969) over
the new anthropology of the ‘sixties demonstrate.

Anthropologists may still find possible significance for anthropology in the general model
underlying the expert system.  A model of some major segment of human action need not
be a single large formal model, but can instead be a series of weakly interacting local models. If these can
be stated consistently, anthropologists can explore at least descriptively how the models
interact with each other.

2.6 Conclusion

The goal of an expert system is to make qualitative judgements, to predict the state of a
system relative to contextual data. However, it may not be clear how an expert system can
help in qualitative analysis. After all, if you have to provide the model, what is the expert
system doing for you? This is not a fair argument, as it applies to any computer-based aid:
the computer does nothing that you could not do with pencil and paper, given ten or twenty
years.  The computer in this role amplifies what can be done.

There are two more serious objections that can be raised.  One is the hidden model
objection, which rests on what happens in the black box of the inference engine to the
model or data that was entered. This is a problem only if there is no control over the
identification and evaluation mechanisms in the system.  In general the other mechanisms
are not terribly important; for example it is not important from an analytic point of view
whether the goal mechanism uses a forward or backward chaining strategy.  That is a
description of how the information is ordered and accessed internally, rather than how it is
evaluated.  However, it is critical to control, or at least understand, the internal evaluation
method, for the analyst is locked into the limited range of possible models that a given
system can accommodate.  This is strictly an issue of access to programming skill (Read
and Behrens 1992:250).

The second objection is to the formal or theoretical basis of the general model of an expert
system.  As outlined above, all current expert systems work more or less upon one general
macro-method: given a list of symptoms and a list of outcomes, the expert system evaluates the
most likely state(s) (outcomes) for the modelled system to take at each point of the analysis.  The
generalised expert system model attempts to achieve this global scope without explicitly
laying down all the paths, rather piecing together a unique solution for each unique
situation, using only a series of small, local models and a general inference mechanism as
the basis.  It does this not by incorporating a single exhaustive model relating all possible
states to each other, but by using individual instances of information and relating them
according to an internal weak-interaction model.  There is formal support for the weak
interaction model in mathematics from Thom (1975), and in anthropology and simulation
from Zackary (1980).

The problem with using expert systems in anthropological analysis is created by the split
between knowledge base and inference engine; in general the non-programmer
anthropologist can only control the knowledge base.  Regardless of the type of models
that the anthropologist sets up in the knowledge base, the anthropologist must know that
the inference model evaluates the interaction of those models as anticipated. This makes the
system suspect for analysis unless one knows the inference model in detail, and is satisfied
that it realistically represents the assumptions that must be made.  This objection is not to
the general approach, but to the fixation on a particular global model: the evaluation
mechanism.  This problem is not unique to expert systems, but arises in any use of
simulation to test models: the result of a model must always be tested against another model
before it can be
interpreted. The properties of the evaluation model must be known and consistent with its
purpose.  If the problem of control can be overcome then the general expert system model
has potential as a means of exploring the interactions of a large number of local models
towards a set of global responses; a method of qualitative simulation.