U-SP: A user-friendly survey analysis package

Philip North

University of Kent at Canterbury

BICA Issue No. 6: September 1987

Introduction

For the past three and a half years a project team in the Applied Statistics Research Unit (ASRU) at the University of Kent at Canterbury, England, has been developing a computer package for the analysis of survey data. The funding for much of this project has been provided by an ESCOR grant from the Overseas Development Administration.

ASRU is a self-financing group which offers a wide-ranging consultancy service in statistics and statistical computing. One of its activities is the running of courses and one of these, mounted annually at Canterbury, is a 3-month course on Sample Surveys in Agriculture and Rural Development (SSARD). It is through the SSARD course that the need for the type of software that has been developed in U-SP became apparent.

Participants on the course are mainly workers from developing countries - indeed, for some time now the statisticians at the University of Kent have had a considerable amount of contact with developing countries - and it was soon clear that they needed survey software which is fully interactive and easy to use (user-friendly), yet which has efficient tabulation facilities for data from complex surveys and can run on microcomputers. No suitable software existed at the time. The Rothamsted General Survey Program (RGSP), for mainframe machines, was favoured by the course tutors for its methodological soundness. Despite this attraction, the course participants did not take easily to the package, finding it difficult to understand and to use. Also, the participants came from environments where the computing facilities available to them were typically very limited, though many were acquiring microcomputers. It was from this setting that the project grew. Although the background here is in a particular area of application, the general principles are likely to be much more widely relevant. Many survey workers will welcome software that is easy to use and conversational in form and, increasingly, many workers, especially those with smaller surveys, will have easy access to microcomputers - possibly their own. This is likely to be true, for example, in some market research applications.

Features of the package

The first release of U-SP is now available and constitutes a very practical data entry and tabulation package for handling sample survey and census data. It is seen as an important tool for the less glamorous but crucial end of survey work, i.e. the collection, entry, cleaning and basic manipulation of data. It can deal with data from surveys with complex hierarchical structures, yet is still easy to use. Development of the package is continuing, and is likely to do so for some time (as one would expect). Future releases will contain analysis facilities for interpreting the tabulated data (e.g. chi-squared statistics, log-linear models, regression analysis), and graphical facilities.

U-SP has very flexible data entry facilities. These allow the user to set up the structure of the survey on the computer, and then to input the data from file or keyboard in fixed or free format, carrying out logical checking of the data at entry. The manner in which the user is led through the specification of the questionnaire is almost in the spirit of questionnaire design and, indeed, the design aspect of the questionnaire, and the survey as a whole, is a feature that it is hoped will be addressed in the further development of the package. Data editing and verification are provided for, and once the data are entered there are extensive data exploration facilities and attractive means of manipulating tables. The final tabular output is of a quality suitable for immediate photographic reproduction.

The whole package is presented to the user in user-friendly style, which has become one of the standard features of software produced by ASRU. The package can be used in expert or non-expert mode. There are extensive help facilities throughout, so that the user can obtain explanatory text (which may be short or verbose, depending on the mode of usage) or lists of options whenever they are required. There is a helpful text recognition facility which is useful if the user mis-types any entries, thus saving on re-typing. This also means that one can shorten each entry to sufficient characters to distinguish it from any other response possible to the current prompt.
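The abbreviation behaviour just described can be illustrated with a short sketch. This is not U-SP's own code (the package is written in APL, and its recognition method is not described in this article). It is a minimal Python illustration, assuming that an entry is accepted when it is an unambiguous prefix of exactly one option, with a closest-match fallback for mis-typed entries. The function name resolve_entry, the 0.6 similarity cutoff and the EDIT/ENTER/EXIT list are hypothetical; the first list uses the names of the package's main option sequences.

```python
from difflib import get_close_matches

# Names of the main option sequences, as given in this article.
MAIN_SEQUENCES = ["MAIN", "SETUP", "DATA", "EXPLORE", "OPERATE", "ANALYSE"]

# Hypothetical option list, used only to show an ambiguous abbreviation.
EXAMPLE_OPTIONS = ["EDIT", "ENTER", "EXIT"]


def resolve_entry(typed, options):
    """Map what the user typed to one option, or return None.

    Assumed rules (not taken from U-SP itself):
      1. an exact match wins;
      2. otherwise a prefix is accepted if it matches exactly one option;
      3. an ambiguous prefix is rejected;
      4. if nothing matches as a prefix, the closest spelling is tried.
    """
    key = typed.strip().upper()
    if not key:
        return None

    by_upper = {opt.upper(): opt for opt in options}
    if key in by_upper:
        return by_upper[key]

    prefix_hits = [opt for upper, opt in by_upper.items() if upper.startswith(key)]
    if len(prefix_hits) == 1:
        return prefix_hits[0]
    if prefix_hits:
        return None  # ambiguous: the user must type more characters

    # Fallback for mis-typed entries; the 0.6 cutoff is an arbitrary choice.
    close = get_close_matches(key, list(by_upper), n=1, cutoff=0.6)
    return by_upper[close[0]] if close else None


if __name__ == "__main__":
    print(resolve_entry("se", MAIN_SEQUENCES))     # SETUP
    print(resolve_entry("SETPU", MAIN_SEQUENCES))  # SETUP
    print(resolve_entry("e", EXAMPLE_OPTIONS))     # None (ambiguous)
```

Under these assumed rules, "se" is enough to select SETUP, whereas "e" alone cannot distinguish between EDIT, ENTER and EXIT, so the user would be prompted again.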

U-SP is a menu-driven package: it is divided into option sequences, each of which has a list (the menu) from which options can be chosen. The sequences are arranged so that the options in each one perform related tasks. This allows the package to be structured so that the user works through the main stages of the survey specification and analysis in order. This is illustrated by the logical order of the main option sequences as follows:

MAIN, SETUP, DATA, EXPLORE, OPERATE, ANALYSE

These allow the user, respectively, to carry out general maintenance of the surveys, to set up the structure of a survey, to input and edit data, to explore the data through tabulations, to carry out table operations and to analyse the data (in the next release). The package is written in APL though, of course, it is not necessary for the user to be aware of this. It does mean, however, that an APL interpreter, or at least a run-time version of APL (where appropriate), is necessary for running the package.

U-SP has been specifically designed to run on microcomputers (it currently runs on the IBM PC XT, the APRICOT and a number of others, and will run on all machines closely compatible with these) but it is also suitable for multi-user microcomputers, minicomputers and mainframe machines. The size of the survey or census that can be handled is really determined by the machine to be used to handle it, and its storage capacity (though on microcomputers, processing time may become another important consideration). On machines for which it is applicable, U-SP allows multi-user data input, and has the capacity for multiple survey storage.

Experience of the use of U-SP at test sites

U-SP has already undergone extensive testing and, as a result, modifications and further developments have been undertaken. For example, the testing proved especially valuable in modifying and improving the design of the database. There have effectively been three test sites where thorough testing of the package has taken place, in addition to the extensive testing carried out by the project team and other assistants from ASRU. The ODA in London have been testing the package but, since it was initially designed primarily with use in developing countries in mind, remote test sites possessing the conditions to be experienced in such usage (with, unfortunately, their associated problems) are clearly of great value. Two such test sites have been used. The first is the National Planning and Statistics Office within the Government of the Republic of Vanuatu. The second is the Integrated Rural Development Project, Mpika, Zambia.

One of the project team has made two extended visits to the Vanuatu test site, in order to liaise directly with the personnel there over the use of the package, and to help sort out problems arising. Experience with the test sites has been extremely valuable in sorting out practical problems in applying the package to the day-to-day use for which it was intended. In Vanuatu, for example, it has been possible to enter many surveys (about 20) into the package, so that they are simultaneously available within it, and to process the results. Use of the package in Vanuatu has generated a considerable amount of interest in it in that region.

A rewarding feature of the work with the package in Vanuatu is how it has become a powerful tool in the hands of the Statistics Office in helping to formulate Government policy. A real example arose in the formulation of policy for the development of the important local crop, kava. Another interesting recent example of U-SP's use concerns the processing of a survey to assess cyclone damage to the agricultural sector in Vanuatu. The encouraging feature of this was the speed with which it was possible to process the results of the survey with U-SP. Speed was essential in this particular survey carried out in 1985 since the results were to be used to assist in the relief food distribution, seed distribution, aid applications for assistance with re-establishing cash crops and other agricultural projects, and to assess the impact of the cyclones on the economy. The questionnaires for the survey were being sent out within 24 hours of the Statistics Office being asked to conduct a survey; within 10 days of the survey being set up about 75% of returns had been received, checked, coded and entered on the computer. Provisional tables were run off almost immediately, and it was possible to give some indications of the likely results to the administrators within 2 weeks of the survey being set up.

The survey set out to cover 150 villages, representing some 22% of the villages in the three districts concerned. At the time of the report, returns had been received from all but two of the villages. The first form in the questionnaire related to the village as a whole and was completed at an initial village meeting. It covered information on damage to copra driers, cattle holdings, cocoa, young coconut plantations and fruit trees. The second form in the questionnaire related to gardens. Five households were selected at random in each village participating in the survey, and the enumerator visited all the gardens for each household. The main objective here was to assess the damage to each of the main food crops, including yam, taro, maniok, kumala and banana, both for those crops that were ready for harvesting and for the crops that had recently been planted and were therefore not yet ready for harvesting. The households were also asked how long they expected their existing food to last and how long they thought it would be before they had an adequate supply of food in their gardens again. A third form related to coconut plantations. The coconut plantations for the same five households as took part in the garden survey were used (unless any of those households did not have a coconut plantation, in which case another household was selected). The enumerator visited each of the coconut plantations.
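The three-form structure of this survey can be pictured as a small hierarchy: a village record, households within each village, and garden and coconut plantation records within each household. The sketch below is a Python illustration of that shape only; it does not show how U-SP stores a survey, and every class, field and function name in it is hypothetical. The constants are the design figures quoted above (150 villages, five households per village, a sample of some 22% of villages).

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Design figures quoted in the article; "some 22%" is approximate.
VILLAGES_SAMPLED = 150
HOUSEHOLDS_PER_VILLAGE = 5
SAMPLING_FRACTION = 0.22


@dataclass
class CropDamage:
    """One garden-form entry for one crop (hypothetical fields)."""
    crop: str
    damage_ready_for_harvest: float  # share of the ready crop damaged
    damage_recently_planted: float   # share of the young crop damaged


@dataclass
class Household:
    """Second and third forms: gardens and coconut plantations."""
    household_id: int
    garden_damage: List[CropDamage] = field(default_factory=list)
    # A household brought in only as a replacement for the coconut form
    # would simply have an empty garden_damage list.
    coconut_plantations: List[Dict[str, float]] = field(default_factory=list)


@dataclass
class Village:
    """First form: completed once, at the initial village meeting."""
    name: str
    village_form: Dict[str, float] = field(default_factory=dict)
    households: List[Household] = field(default_factory=list)


def planned_households(villages=VILLAGES_SAMPLED,
                       per_village=HOUSEHOLDS_PER_VILLAGE):
    """Number of garden-form households implied by the design."""
    return villages * per_village


def approx_villages_in_districts(sampled=VILLAGES_SAMPLED,
                                 fraction=SAMPLING_FRACTION):
    """Rough total number of villages implied by the sampling fraction."""
    return round(sampled / fraction)


def count_garden_records(villages):
    """Total crop-damage entries across all households in all villages."""
    return sum(len(h.garden_damage) for v in villages for h in v.households)
```

Under these assumptions the design implies 750 garden-form households and, since the 22% figure is approximate, roughly 680 villages in the three districts. Both figures are simple arithmetic on the numbers quoted above, not results reported by the survey.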

Although this example is a rather specific one from overseas, the structure of the survey can be thought of in general terms in other contexts, and the example does serve to illustrate one type of survey that U-SP can handle quite comfortably.

Discussion

Much of the above account has been set in a rather specialised context, but this is simply because of the way in which the development project was first established. Readers of this Newsletter are likely to be working in different areas and to have their own survey examples which might be in some contrast to the one just described. But the structures of many of the readers' surveys are nevertheless likely to be of a form for which U-SP is quite appropriate, and it is hoped that the package will find use in a variety of contexts and locations.

Package development work such as that described here is just one aspect of the activities of ASRU, which was set up in 1980. These activities also include contract work (often data analysis tasks), consultancy, funded research (or, alternatively, student research projects) and statistics training courses (which can be run on-site for organisations outside the University). Of particular interest in the present context is the recent research at Kent, which has included examination of the use of imputation methods in surveys and of the use of model-based approaches to inference from sample surveys, involving the concept of the superpopulation model.

Further information about the U-SP package in particular, or ASRU in general, can be obtained from the author.

Philip M. North

Director, Applied Statistics Research Unit, Mathematical Institute, University of Kent at Canterbury, Canterbury, Kent, CT2 7NF
