Wednesday 18 December 2013

Data linkage


1. Data linkage may be performed on an ad-hoc basis to serve only the needs of a single research project. An example of ad hoc data linkage was the Australian Government’s follow-up of mortality and cancer incidence in military personnel who worked in the vicinity of Australian nuclear test sites. A register of the veterans exposed to the test sites was linked as a once-off exercise to the National Death Index and National Cancer Clearing House.
Alternatively, data linkage may be undertaken on a systematic basis with no specific research project in mind at the time that the links are created. Rather, the links are stored and later retrieved to support multiple discrete research projects as the needs arise. An example of systematic data linkage is the WA Data Linkage System.
In ?500 words, identify and explain the advantages and disadvantages of systematic data linkage when compared with ad-hoc data linkage. Write your answer in the space below, which continues on to the next page (10 marks).
2. A researcher has requested an ad-hoc linkage of a file of death records to a file of hospital morbidity records. The researcher says that their aim is to follow the patients from the time of their first hospital admission until the correct date of death (in those who are deceased). The following are the partial identifiers in the two files.
Death records – partial identifiers
File number  Social security number  Surname    First name  Sex  Date of Birth  Postcode
1            711459                  Holeman    D’Arcy      M    17.02.1955     6014
2            498230                  Saville    Beth        F    11.11.1969     6034
3            958940                  Holman     Gary        M    04.05.2001     6074
4            345964                  Batty      Jenny       F    09.10.1980     6093
5            184444                  Fosil      Emily       F    29.02.1906     6009
Hospital morbidity records – partial identifiers
File number  Social security number  Surname    First name  Sex  Date of Birth  Postcode
1            711459                  Holman     Cashel      M    17.02.1955     6014
2            134444                  Fossil     Emily       F    29.02.1906     6009
3            223856                  Holman     D’Arcy      F    04.11.1961     6034
4            334569                  Holman     James       M    01.03.1981     6069
5            334568                  Holman     Emily       F    01.03.1981     6069
6            345928                  Batty      Gulliver    M    01.03.1981     6069
7            223856                  Holman     D’Arcy      M    04.11.1967     6123
8            945872                  Batty      Ron         M    04.06.1950     6069
9            138572                  Holman     Daisy       F    20.02.1920     6009
10           509485                  Cooper     Clarence    M    09.03.1965     6069
11           498230                  Saville    Elizabeth   F    11.11.1969     6034
12           238759                  Holman     Geraldine   F    05.02.1954     6049
13           209523                  Battie     John        M    04.06.1950     6069
14           209523                  Batty      John        M    06.04.1975     6069
15           223856                  Holeman    Ignatius    M    01.03.1961     6009
16           345964                  Holman     Jenny       F    09.10.1980     6093
17           985433                  Sydchrome  D’Arcy      F    05.02.1995     6038
18           295843                  Holman     Geraldine   F    12.12.1935     6046
19           687345                  Battie     Emily       F    17.03.1945     6029
20           333345                  Holeman    Kathryn     F    20.06.1973     6069
In ?500 words, explain how you would go about performing this ad hoc data linkage, using technical methods covered by this course. You should draw on examples from the data shown above, as appropriate, to illustrate your points. Write your answer in the space below, which continues on to the next page (10 marks).
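As a rough illustration of the kind of field-by-field comparison that question 2 is concerned with (not a model answer), the sketch below compares death record 5 with hospital record 2 from the tables above. The field names, the helper functions `agreement` and `similarity`, and the use of a generic string-similarity ratio are all assumptions made for illustration; the course may prescribe different comparison methods or weights.

```python
from difflib import SequenceMatcher

# Field names are hypothetical labels for the columns shown in the tables.
FIELDS = ("ssn", "surname", "first_name", "sex", "dob", "postcode")

# Death record 5 and hospital record 2, copied from the tables above.
death_5 = dict(zip(FIELDS, ("184444", "Fosil", "Emily", "F", "29.02.1906", "6009")))
hospital_2 = dict(zip(FIELDS, ("134444", "Fossil", "Emily", "F", "29.02.1906", "6009")))


def agreement(a, b):
    """Return True/False for exact agreement on each partial identifier."""
    return {field: a[field] == b[field] for field in FIELDS}


def similarity(x, y):
    """Generic string similarity between 0 and 1 (an assumed stand-in for
    whatever approximate comparison the linkage software provides)."""
    return SequenceMatcher(None, x.lower(), y.lower()).ratio()


pattern = agreement(death_5, hospital_2)
exact_agreements = sum(pattern.values())

# Fields that disagree exactly may still be close (e.g. a one-character typo).
surname_score = similarity(death_5["surname"], hospital_2["surname"])
ssn_score = similarity(death_5["ssn"], hospital_2["ssn"])

print(pattern)
print(exact_agreements, round(surname_score, 2), round(ssn_score, 2))
```

In this pair the social security number and surname disagree exactly while the other four identifiers agree, which is the sort of pattern a reviewer would need to weigh up rather than accept or reject automatically.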
3. There are at least two methods of constructing a measure of perinatal mortality:
Method A: Count the numbers of stillbirths, livebirths, and neonatal deaths within 28 days of age, all occurring from 1 January to 31 December within a single calendar year. Divide (stillbirths + neonatal deaths) by (stillbirths + livebirths).
Method B: Count the number of stillbirths and livebirths occurring from 1 January to 31 December within a single calendar year. Follow up the liveborn neonates and count how many of those neonates die within 28 days of birth. Divide (stillbirths + neonatal deaths) by (stillbirths + livebirths).
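To make the difference in counting concrete, here is a minimal sketch that applies both methods to a handful of invented records. The year, the dates, the stillbirth count and the reading of "within 28 days" as fewer than 28 completed days are all assumptions for illustration only; the numbers are far too small to be realistic.

```python
from datetime import date

YEAR = 2010  # hypothetical calendar year

# Hypothetical livebirths: (birth date, death date or None if alive).
livebirths = [
    (date(2009, 12, 20), date(2010, 1, 5)),   # born the year before, died in YEAR
    (date(2009, 12, 25), date(2010, 1, 2)),   # born the year before, died in YEAR
    (date(2010, 3, 1), None),
    (date(2010, 6, 10), date(2010, 6, 15)),   # born and died in YEAR
    (date(2010, 9, 9), None),
    (date(2010, 12, 28), date(2011, 1, 3)),   # born in YEAR, died the year after
]
stillbirths_in_year = 1  # hypothetical count of stillbirths occurring in YEAR


def is_neonatal_death(birth, death):
    """Assumed definition: death at fewer than 28 completed days of age."""
    return death is not None and (death - birth).days < 28


births_in_year = [(b, d) for b, d in livebirths if b.year == YEAR]

# Method A: neonatal deaths are counted by the year in which the death occurred.
deaths_a = sum(1 for b, d in livebirths if is_neonatal_death(b, d) and d.year == YEAR)

# Method B: neonatal deaths are counted among the babies born in YEAR,
# whenever the death occurred.
deaths_b = sum(1 for b, d in births_in_year if is_neonatal_death(b, d))

denominator = stillbirths_in_year + len(births_in_year)
method_a = (stillbirths_in_year + deaths_a) / denominator
method_b = (stillbirths_in_year + deaths_b) / denominator

print(deaths_a, deaths_b, denominator, method_a, method_b)
```

The two methods share a denominator but can count different deaths in the numerator, because deaths close to the start or end of the year belong to different birth years.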
Now answer the following questions:
i. What type of mathematical expression (ratio only, proportion, rate, etc.) is yielded by each method? (1 mark)
Method A? ________________________________
Method B? ________________________________

ii. What type of population (cohort, dynamic, cross-section thereof, etc.) forms the denominator in each method? (1 mark)
Method A? ________________________________
Method B? ________________________________

iii. What type of basic epidemiologic measure (prevalence, cumulative incidence, incidence rate, survival, etc.) does the result from Method B represent? (1 mark)
________________________________

iv. Which of Methods A and B requires data linkage? (1 mark)
________________________________
v. Assume that there is some loss to follow-up of livebirths at the time of birth such that only X% of livebirths can then be followed for deaths within the first 28 days. However, follow-up is then complete for all of the X% of livebirths that continue under surveillance. Under these circumstances, explain the epidemiological theory of how you would go about estimating the conditional cumulative incidence of perinatal death (a brief outline using algebraic notation is sufficient) (2 marks).
vi. Now assume that, in addition to an X% loss to follow-up at birth, a further Y% of livebirths become lost to follow-up during the first 28 days. Under these additional circumstances, explain the epidemiological theory of how you would go about estimating the conditional cumulative incidence of perinatal death (a brief outline using algebraic notation is sufficient) (2 marks).
4. You have been given a linked data file that is a merge between births (stillbirths + livebirths) occurring in 2006 and subsequent deaths, provided that the deaths also occurred in 2006. In other words, the length of potential follow-up varies from 364 days for those born on 1 January 2006 down to zero days for those born on 31 December 2006. The linked file contains one record per birth with any deaths appended in the EOR loading area. The file is not in any particular order when given to you. The file contains the following variables:
rootlpno    unique personal ID
confineno   unique confinement ID
birthdate   date of birth
status      1=livebirth; 2=stillbirth
deathdate   date of death (where applicable)
The file contains a number of multiple births, which are denoted by having the same confinement number.
You have been asked to calculate the conditional cumulative incidence ratio, comparing members of multiple birth sets with singleton births. In the space below and overleaf, write out the syntax (in SPSS, SAS or Stata) that you would use to generate the figures from the linked file needed to perform this calculation. Your syntax should include sufficient documentation to enable someone else to follow your reasoning and approach (20 marks).
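The question asks for SPSS, SAS or Stata syntax, so the following is only a sketch, in Python, of two preparatory steps described above: flagging members of multiple birth sets from a shared confinement number, and working out each livebirth's days of follow-up to death or to 31 December 2006. The records are invented, and the derived names `multiple` and `followup_days` are hypothetical; it does not go on to compute the cumulative incidence itself.

```python
from collections import Counter
from datetime import date

END_OF_FOLLOW_UP = date(2006, 12, 31)  # deaths are only linked if they occurred in 2006

# Invented records using the variable names listed in the question.
records = [
    {"rootlpno": 1, "confineno": 100, "birthdate": date(2006, 1, 1), "status": 1, "deathdate": None},
    {"rootlpno": 2, "confineno": 101, "birthdate": date(2006, 5, 4), "status": 1, "deathdate": date(2006, 5, 10)},
    {"rootlpno": 3, "confineno": 101, "birthdate": date(2006, 5, 4), "status": 2, "deathdate": None},
    {"rootlpno": 4, "confineno": 102, "birthdate": date(2006, 12, 31), "status": 1, "deathdate": None},
]

# The file arrives unsorted, so count births per confinement rather than
# relying on adjacent records.
births_per_confinement = Counter(r["confineno"] for r in records)

for r in records:
    # A confinement number shared by two or more births marks a multiple birth set.
    r["multiple"] = births_per_confinement[r["confineno"]] > 1

    if r["status"] == 1:
        # Livebirths are followed until death or the end of 2006, whichever is first.
        end = r["deathdate"] or END_OF_FOLLOW_UP
        r["followup_days"] = (end - r["birthdate"]).days
    else:
        # Stillbirths have no follow-up time.
        r["followup_days"] = None

for r in records:
    print(r["rootlpno"], r["multiple"], r["followup_days"])
```

Counting by confinement number keeps stillborn members of a set in the count, so a liveborn twin whose co-twin was stillborn is still flagged as a member of a multiple birth set.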
