Journal of Quantitative Criminology, Vol. 16, No. 2, 2000
Visualizing Lives: New Pathways for Analyzing Life
Course Trajectories
Michael D. Maltz1 and Jacqueline M. Mullany2
The goal of statistical analysis is to find patterns in data. Most statistical methods
rely on analyzing the effect of the same set of variables on the population under
study, i.e., nomothetic analysis. Therefore, when data are collected in the social
sciences, most often they are put in a framework that resembles a spreadsheet:
each row represents a separate individual, and each column represents a separate
characteristic (or variable) that pertains to that individual.
However, not all individuals in the study are affected by the same set of
variables: each individual may have his/her own individual set of relevant variables, suggesting that methods be developed that consider them individually, i.e.,
idiographic analysis. Moreover, lives are lived chronologically, and are often best
described in narrative form. These narratives usually have to be condensed, or
abridged in other ways, in order to fit the data framework and permit what
one might call ‘‘algorithmic analysis’’. Each set of methods has its advantage:
nomothetic methods generate general laws that apply to all, while idiographic
methods trace the putative causal relationships that are unique to each
individual.
This paper describes another data collection and analytic framework, one
that (a) is chronological; (b) recognizes that different people may have experienced entirely different events and thus may need different ‘‘variables’’ to understand their behavior; (c) recognizes that, even if people experience similar events,
they may have entirely different reactions to them; and (d) can be studied (and
patterns inferred) using an exploratory graphical analysis that is more free-form
than algorithmic analysis. Examples of this type of analysis used in different
medical and criminal justice contexts are given, and suggested directions of
research in this area are described.
KEY WORDS: longitudinal analysis; graphical analysis; life course; data
visualization; exploratory data analysis.
1. INTRODUCTION
This paper describes an alternative means of collecting, coding, and
analyzing life course data that permits a greater understanding of life course
1 Department of Criminal Justice, University of Illinois at Chicago, 1007 W. Harrison Street (M/C 141), Chicago, Illinois 60607-7140.
2 School of Public and Environmental Affairs, Indiana University Northwest, 3400 Broadway, Gary, Indiana 46408.
0748-4518/00/0600-0255$18.00/0 © 2000 Plenum Publishing Corporation
dynamics. We first describe the restrictions that the current array of analytic
tools places on the collection of data, particularly life course data; we then
describe an alternative data collection framework that has fewer such limitations. Subsequent sections describe how data collected using this alternative framework can be depicted and analyzed using visual data analysis
techniques, and provide examples that have been reported in the literature.
Extensions of these techniques are then provided, and the paper closes with
a list of attributes that a set of graphically-based analytic tools would need
to have in order to accommodate the analysis of data using the suggested
data collection framework.
2. LIFE COURSE ANALYSIS
Social science has as its primary goal the understanding of how and
why people behave the way they do. Life course analysis is one of the means
(and from our standpoint, one of the best means) of attaining this understanding, since individuals’ behavior is conditioned on their earlier life
experiences.
Individual behavior can be seen as a sequence of outcomes. Some of
the outcomes may be due primarily to structural conditions in society, while
others may be due primarily to the individual’s initial conditions or subsequent experiences. In criminology, one school of thought (Gottfredson
and Hirschi, 1990) attributes the outcomes largely to the individual’s initial
conditions; another school of thought suggests that initial conditions can
largely be negated by subsequent experiences (e.g., Nagin and Paternoster,
1991; Sampson and Laub, 1993); and a third (e.g., Moffitt, 1993; Nagin,
1999) suggests that both schools are correct to some extent; i.e., that some
individuals cannot change their behaviors easily while others can.
Regardless of the school of thought, similar techniques are used to
investigate the relationship between behaviors and their correlates. A group
of individuals is studied, a set of variables is specified, and a set of data is
collected. Although we know that different things happen to different people over their life course, in order to analyze patterns we collect the same
data for everyone. The reason for this is supposedly that we want to
have a level playing field, to examine the characteristics of each individual
on the basis of the same criteria. We suggest that this is not really the reason
(or is not the sole reason), and that new tools need to be developed to
examine individuals’ lives in better ways.
For example, we know that in many cases the same things happen to
different people, but each may have a different reaction to the same stimulus
(Farrington, 1993; McCord, 1990, 1993; Maltz, 1994). That is, a divorce
may adversely affect one family, while it frees another family from the
stresses of an abusive parent and spouse. In the first case the family (or
some of its members) may become dysfunctional; in the second case they
(or some of them) may thrive. Or a move to a different neighborhood may
improve one youth’s life prospects but adversely affect another’s. Yet patterns as complex as these may become masked by virtue of the analytic tools
that are normally brought to bear on the data. This may be due in large
part to the way data are collected so that these analytic tools can be applied,
and to the types of data collected for analysis.
3. THE STANDARD DATA COLLECTION FRAMEWORK
The standard framework for data collection in the social sciences is
predicated on the use of hypothesis tests. The research question is usually
cast as an inquiry into the effect of a specific variable (or set of variables)
on an outcome (or set of outcomes). If the magnitude of the effect is sufficiently large that it is unlikely to be attributable to chance, then the effect is
said to exist, to be statistically significant.3
But hypothesis tests are rather crude analytic tools (Loftus, 1993;
Maltz, 1994). In fact, consider the data framework in which these tests are
applied. In general, a data base is analyzed; that data base contains a list
of variables (the columns) and a number of ‘‘observations’’ of those variables, where each observation (i.e., each row) represents the value of each
variable as it pertains to a separate individual.
The term ‘‘observation’’ comes from the natural sciences, wherein outcomes of an experiment are observed and tabulated. Each observation may
represent a replication of the experiment under both similar and different
conditions, in order to determine the extent of variation in the outcomes
and how they are affected by different conditions. For example, different
observations of the pressure, temperature, and volume of a gas will lead a
researcher to Boyle’s Law. And Fisher, in The Design of Experiments (1935),
observed the yield of different strains of barley seed in different years and
at different sites to determine the relationship linking yield to site and seed
type.
But observations are different in the social sciences. Observing people
is not like observing gases or crop yield; the situation with respect to data
collection in the social sciences is much more reflexive than in a physics or
agricultural experiment (e.g., Briggs, 1986). Moreover, use of this term in
the social sciences implies that there is only one mode of behavior, leading to
3 Statistical significance only makes sense when the data represent a random sample, but the use
of a random sample seems to be honored more in the breach than the observance (Maltz and
Zawitz, 1998).
a single ‘‘law’’, a single relationship between the dependent and independent
variables—yet we know that people often react quite differently to the same
stimulus.4 And even when many individuals react similarly to the same
stimulus, their reasons for doing so may be quite different from one another
(Kagan, 1998: 76).
Although social science researchers recognize that different things happen to different people (which would almost seem to require a different set
of variables for every individual in the data set), the number of variables is
limited by the analyst: if the number of variables is not limited, the data set
becomes so sparsely populated and its analysis so unwieldy and idiosyncratic that the analysis may not come up with (statistically significant) findings of any utility whatever. Thus the collection of data is usually restricted
to only those variables that are common to most of the individuals under
study, so that a sufficiently large number of individuals’ records can be
analyzed to provide statistically significant findings. This procedure has
been used to great benefit in the social sciences, to uncover nomothetic
relationships (i.e., general laws; nomos is Greek for law) among variables
(Maxfield and Babbie, 1998: 49). However, it is also limiting; such studies
may just ‘‘round up the usual variables’’ (Maltz, 1994: 451). In other words,
the need for sufficient data restricts the view—and, we would suggest, the
vision—of the researchers. In fact, the rectangular shape of most data sets
(Fig. 1) can be seen as a confinement of sorts—our assessment of people’s
lives is based only on the variables they hold in common, ignoring the very
real (and important) individual differences they may have that do not fit in
the restricted data frame.
This situation exists in part because of the way the data are analyzed.
As described earlier, standard analyses test hypotheses about the effect of
variables on outcomes of interest. This is a variable-based analysis, in which
the effect of the variable is assumed a priori to be univalent—that is, a
weight is attached to that variable that is either positive or negative,
implying that the variable either increases or decreases the outcome. However, as we noted earlier, effects are not that straightforward in studying
human behavior.
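A small numerical sketch may make this point concrete. The data below are synthetic and chosen only for illustration (they come from no study): in one hypothetical subgroup a move raises the outcome, in the other it lowers the outcome by the same amount. The helper `ols_slope` is our own name, not part of any library.

```python
def ols_slope(x, y):
    """Least-squares slope of y on x (one predictor plus an intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx

# Synthetic, illustrative data. x = 1 if the youth moved, 0 otherwise.
# Group A (hypothetical): the move removes the youth from a risky setting.
moved_a = [0, 0, 1, 1]
outcome_a = [2.0, 2.0, 4.0, 4.0]   # outcome is 2 units higher after a move
# Group B (hypothetical): the move puts the youth at greater risk.
moved_b = [0, 0, 1, 1]
outcome_b = [2.0, 2.0, 0.0, 0.0]   # outcome is 2 units lower after a move

slope_a = ols_slope(moved_a, outcome_a)
slope_b = ols_slope(moved_b, outcome_b)
slope_pooled = ols_slope(moved_a + moved_b, outcome_a + outcome_b)
print(slope_a, slope_b, slope_pooled)
```

In this constructed case the pooled coefficient is zero although the effect is sizable within each subgroup, which is the kind of masking a single univalent weight can produce.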
Others (e.g., Magnussen, 1990; Kagan, 1997) have written about the
advantages of person-based analyses over variable-based analyses in developmental research. Person-based analyses focus on developmental
sequences and chronology, which can show the effect not just of the variables but of how they are sequenced over time. It is possible to employ the
4 For example, when considering the effect of divorce or of residential mobility on youths, a
single coefficient is obtained for each variable, masking the fact that some youths are removed
from risky situations and others are put at greater risk.
Fig. 1. A Standard Data Collection Framework in SPSS.
standard data framework to investigate developmental and chronological
aspects of life courses; for example, Nagin (1999) describes an innovative
procedure developed to deal with time-sequenced data. This method is
predicated on the same data elements being collected repeatedly from each
individual in the study, again in the search for general laws of human
behavior. But even person-based analyses may fall short of their potential
if consideration is not given to:
• permitting variables to have multivalent effects,
• lifting the restriction on the number of variables on which data are
collected, and
• retaining the narrative aspects of the collected data.
A person-based analysis that considers these factors requires a different
framework for collecting data. We propose the following alternative data
collection frame that takes these factors into consideration.
4. AN ALTERNATIVE DATA COLLECTION FRAMEWORK
When people fill out forms or answer structured questionnaires, they
provide information on variables that have been selected according to the
criterion mentioned earlier—variables that are relevant to a reasonably large
number of subjects. They would ordinarily not include all of the factors or
incidents that were important in each individual’s life, because so many of
them may in fact be idiosyncratic. Yet even in structured interviews people
may give narratives and tell stories, with all the ‘‘thick description’’ (Geertz,
1973: 6) and contextualization that that implies. Much of this information
may be lost, because it does not fit into the restrictive data frame based on
common information.
This need not be the case. One can envision an alternative data framework that augments common information with subject-defined variables
based on the narratives given by the subjects. It is not suggested that all
such information will be useful, or that the subjects all know what the salient events in their lives were, or that they will trust the interviewers sufficiently to provide such information; however, insofar as such information is
obtained, one can develop an alternative data collection framework that
would use subject-defined variables based on these narratives.
The narratives will, of course, all be different. Moreover, the number
of different events that occur to different people may vary considerably:
Mike may have led a relatively placid existence while Jackie’s life may have
a great number of significant turning points.
The question then becomes how to achieve the goal described earlier:
to obtain an understanding of how people are affected by their experiences.
We suggest that this alternative data collection framework can be used to
draw individual time line trajectories showing life course events, based on
the narratives. Events and processes would be arrayed on the time line, a
separate time line for each individual in the study. This has a number of
advantages: First, it portrays events in their chronological sequences, which
often suggests possible causal connections. Second, with each person’s time
line(s) being portrayed individually, the fact that some lives are more complex than others can easily be seen and grasped. Third, it shows not only
the sequencing of events (which event came first) but also when the events
occurred (the date), which may be important from three standpoints:
• developmental: child abuse takes on a different importance if it
occurred at age 3 or at age 15;
• historical: a person’s unemployment takes on a different meaning if
it was during a period of prosperity or during an economic downturn; and
• cohort: those who were born in the 1930s and experienced unemployment at
age 20 will have different trajectories than those who were born in the 1940s
and experienced unemployment at age 20.
Thus, it may be possible to distinguish age, period, and cohort effects in
this kind of framework. Moreover, it is much easier to deal with the effect of
event sequencing in this framework than using standard analytic methods.
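As a minimal sketch of how the three standpoints can be read off dated events, the following derives age, period, and cohort from a date of birth and an event date. The two subjects, their dates, and the function names `age_at` and `cohort` are hypothetical, introduced only for illustration.

```python
from datetime import date

def age_at(event: date, dob: date) -> int:
    """Completed years of age on the event date."""
    before_birthday = (event.month, event.day) < (dob.month, dob.day)
    return event.year - dob.year - (1 if before_birthday else 0)

def cohort(dob: date) -> str:
    """Birth cohort labelled by decade, e.g. '1930s'."""
    return f"{dob.year // 10 * 10}s"

# Two hypothetical subjects who both lose a job at age 20.
subjects = {
    "A": {"dob": date(1935, 6, 1), "job_loss": date(1955, 9, 1)},
    "B": {"dob": date(1945, 6, 1), "job_loss": date(1965, 9, 1)},
}

summary = {
    name: {
        "age": age_at(s["job_loss"], s["dob"]),   # developmental standpoint
        "period": s["job_loss"].year,             # historical standpoint
        "cohort": cohort(s["dob"]),               # cohort standpoint
    }
    for name, s in subjects.items()
}
print(summary)
```

Both subjects share the same age at the event but differ in period and cohort, so the three quantities can be examined separately once the dates are retained.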
The framework within which these kinds of data are collected is quite
different from the standard data collection framework. Rather than there
being a column for every variable and row for every respondent (Fig. 1), it
includes a separate page of data (as in a spreadsheet) or table of data (as in
a database) for every subject. The page has a row for every relevant event
or process (e.g., date of birth, school, a new job). The first column contains
the dates of the events or processes, and subsequent columns contain
descriptors of the events and assessments of their impact; an example is
shown in Table I. The number of entries for an individual depends upon
the amount of activity the individual generated over his/her life course, and
not upon the number of variables selected by the researcher as being important. [In fact, sufficient information about these events may already exist in
the data collected as part of a study, but may not be used as coders try to
fit the rich information into the restrictive standard data collection
framework.]
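The per-subject layout described above might be sketched as follows. This is one possible representation under our own assumptions, not the authors' implementation: the class names `Event` and `Subject`, the field names, and every entry for the two subjects are invented for illustration.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional

@dataclass
class Event:
    start: date                    # first column: date of the event
    domain: str                    # e.g. "school", "work", "family"
    description: str               # narrative descriptor
    end: Optional[date] = None     # set for processes that last a while
    impact: Optional[str] = None   # assessment; may differ across subjects

@dataclass
class Subject:
    name: str
    events: List[Event] = field(default_factory=list)

    def add(self, event: Event) -> None:
        self.events.append(event)
        self.events.sort(key=lambda e: e.start)   # keep chronological order

# Hypothetical subjects; the entries are invented for illustration.
mike = Subject("Mike")
mike.add(Event(date(1980, 9, 1), "school", "starts grade school"))

jackie = Subject("Jackie")
jackie.add(Event(date(1992, 3, 1), "family", "parents divorce", impact="positive"))
jackie.add(Event(date(1980, 9, 1), "school", "starts grade school"))
jackie.add(Event(date(1994, 6, 1), "work", "first job", end=date(1995, 1, 15)))

# The number of rows varies by subject, unlike the rectangular frame.
print(len(mike.events), len(jackie.events))
print([e.description for e in jackie.events])
```

The design choice here is that each subject owns a list of rows of arbitrary length, so a placid life and an eventful one are stored without padding either to a common set of variables.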
This type of analysis is termed idiographic analysis (Maxfield and
Babbie, 1998: 49), since it looks at each individual’s trajectory separately to
infer causal relationships rather than across all individuals. The problem
with idiographic analyses, however, is that it is difficult to generalize from
them. Many believe that nomothetic methods are the best way to gain an
understanding of a process, and cite Meehl (1954) as justification; they interpret his primary finding to be ‘‘statistical prediction trumps clinical
prediction.’’
However, as he noted (Meehl, 1973: 83), there are times when we
‘‘should use our heads instead of the formula’’, when the algorithm is not
as good as the brain. Granted, when the task at hand is applying a known
algorithm to a set of data in the service of prediction or classification, the
human brain cannot compete with the computer—or, more generally, with
a mechanical application of the algorithm. But when the algorithm is
unknown, the tables are turned (Richters, 1997: 224). We suggest that this
may be the case in life course analysis, and that computer-based visual techniques are more suited to the analysis of life course trajectories than statistical algorithms.
That is, computers can be used to do more than just compute. They
can be used to organize data in ways that make it easier to discern patterns
visually; i.e., by using them as ‘‘power steering’’ devices, where the driver
decides on the route, rather than as ‘‘autopilots’’, where the decision-making
is taken from the driver’s hands (Maltz et al., 1991: 46). The next section
describes different techniques currently used to analyze data visually.
Instead of an approach that is geared toward algorithmic analysis and testing hypotheses, this framework is geared toward exploratory data analysis
Table I. Example of a Chronological Data Collection Framework: Partial Listing of ‘‘Megan’s’’ Domains and Events

Column headings: Dates | DOB | Siblings’ DOB | Moves | Grade schools | Relationships | Children’s DOB | High schools | Date of probation

Dates column, in the order printed:
05/04/61 (DOB, 0.25), 08/24/95, 05/04/66, 03/02/79, 03/02/85, 03/02/89,
03/01/49, 04/01/50, 05/01/53, 06/01/55, 04/01/56, 07/01/57, 08/01/58,
09/01/62, 05/01/64, 06/01/66, 07/01/68, 05/01/78, 05/01/70,
05/04/61, 05/02/66, 03/02/79, 03/02/85, 03/02/89, 04/02/95,
09/01/68, 06/07/70, 09/01/70, 06/07/73, 09/01/73, 10/04/74, 10/05/74, 06/07/75,
06/25/75, 09/01/75, 12/15/78, 09/01/78,
03/01/74, 12/01/78, 03/01/83, 02/01/82, 08/01/82

Cell values, in the order printed:
4.25; 0.5 (11 entries); 1 (6 entries); 1.25 (8 entries); 2; 2.25; 2.25; 2; 1.75; 1.75; 2; 1.75; 1.75

Numbers in the cells represent the y-values of the graphed variables. For example, ‘‘Megan’s’’
first relationship was between 1974 and 1978, and this is represented in Fig. 8 as a line 1.75
units above the x-axis.
(Tukey, 1977), specifically graphical analysis, and the generation of
hypotheses.
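The y-values in Table I suggest a simple way of turning dated entries into plotting coordinates. The following is a hedged sketch only: the helper names `decimal_year` and `to_segment` are ours, dates are assumed to be in month/day/year order, and the month and day of the relationship dates are placeholders because the table note gives only the years.

```python
from datetime import date

def decimal_year(d: date) -> float:
    """Convert a date to a fractional year for use as an x-coordinate."""
    start = date(d.year, 1, 1)
    end = date(d.year + 1, 1, 1)
    return d.year + (d - start).days / (end - start).days

def to_segment(start: date, end: date, level: float):
    """Return ((x0, y), (x1, y)): a horizontal line at height `level`."""
    return ((decimal_year(start), level), (decimal_year(end), level))

# Levels 0.25 (date of birth) and 1.75 (relationships) appear in Table I.
# January 1 is an assumed placeholder for the relationship dates, since
# the table note gives only the years 1974 and 1978.
birth_marker = (decimal_year(date(1961, 5, 4)), 0.25)
relationship = to_segment(date(1974, 1, 1), date(1978, 1, 1), 1.75)

print(birth_marker)
print(relationship)
```

Events become points and processes become horizontal segments, each at the height assigned to its domain, so that several domains can share one time axis without overlapping.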
5. VISUAL DATA ANALYSIS
We are accustomed to using computer algorithms to analyze data, to
provide us with specific answers (i.e., whether a variable is or is not statistically significant). But data can be analyzed visually as well, to infer patterns
of a different sort.
Graphical analysis has its roots in Playfair’s Commercial and Political Atlas, published
in 1786 (Wainer, 1997). His techniques were brought into the (early) computer age by Tukey in his classic book, Exploratory Data Analysis (1977).
Tukey distinguished between confirmatory data analysis—of which significance testing is a primary example—and exploratory data analysis, in particular, the depiction of data in a way that ‘‘forces us to notice what we
never expected to see’’ (p. vi, emphases in the original). Although his book
is somewhat dated—he devotes a great deal of attention to pencil-and-paper
methods, many of which have become superfluous since the widespread use
of computers—it contains a great number of examples of the use of graphical methods to find patterns in data.
Six years later, Tufte approached the subject from the other side.
Instead of looking at how data might be graphed, he concerned himself with
how graphs of data convey information. In The Visual Display of Quantitative Information (1983), he criticized the way data were displayed in graphs;
he coined the term ‘‘chartjunk’’ to refer to graphs that used kitschy symbols
to draw attention to them, but that were basically ‘‘data-thin’’. His two
subsequent books (Tufte, 1990, 1997) reinforced this concern and provide
additional examples of graphing techniques that enhance the understanding
of data.
Cleveland moved the field of graphical data analysis a few steps further.
In his books, Visualizing Data (1993) and The Elements of Graphing Data
(1994), he provided detailed examples of how data might be explored visually as an analytic tool. The use of some of these techniques can be found
in Maltz (1998).
Cleveland (1993: 328) provides a telling example that shows the benefit
of graphical analysis over standard statistical analysis. He used a data set
that was first analyzed by Fisher in the 1930s (on the yield of different
strains of barley in different sites in 1931 and 1932) and subsequently found
its way into a number of statistics texts. However, none of the earlier statisticians actually plotted the data until Cleveland did so—and he found that
someone had mistakenly switched the 1931 and 1932 data for one of the
sites! Although we do not expect to uncover similar problems with life history data, this example does point out the benefit of actually looking at the data before
analyzing them with a statistical algorithm.
Two general strategies have been used for depicting life course trajectories, depending on the number of variables depicted in each trajectory.
When there are only a few variables, many trajectories can be plotted simultaneously; however, when many variables are to be plotted (in the examples
given below, more than three), only one trajectory can be plotted in a single
chart.
5.1. Few Variables
Three examples of trajectory depiction are given. Two are from the
medical literature and are based on survival analysis; the third describes
family dynamics.
Goldman (1992) uses ‘‘eventcharts’’ to display survival data; the data
were taken from a bone marrow transplant database to study the prevention
of cytomegalovirus infection. In Fig. 2 the vertical axis represents the date
of an intervention, in this case a bone marrow transplant. The further up
the vertical axis the line starts, the more recently the transplant took place.
The horizontal axis represents time since intervention; the length of each
line represents the length of time each individual was observed (or stayed
alive—the vertical bar at the end of a horizontal line represents time of
death). To give some perspective to the time available for observation, a
‘‘now’’ line (the diagonal dashed line) indicates the extent to which survival
data are available—those who entered the treatment program later in time
have a shorter maximum follow-up time. Markers on the line represent
characteristics of the patient’s medical history; the open circles represent
when Graft-vs.-Host Disease (GvHD) occurred. Goldman (1992: 14) notes
that ‘‘GvHD occurs early and although a serious complication, is seldom
followed by relapse or death’’; as can be seen in the figure, most of the
trajectories with open circles traverse the chart all the way to the now line,
indicating that they did not die, at least during the time they were observed.
The closed squares in the figure represent time of relapse.
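The geometry of such a chart can be sketched in a few lines. This is an illustration under our own assumptions, not Goldman's software: the function name `eventchart_row`, the two patients, and the cutoff date are hypothetical.

```python
from datetime import date

def eventchart_row(entry: date, last_seen: date, now: date):
    """Coordinates for one patient's line in an eventchart-style plot.

    y is the calendar date of the intervention; the line runs from x = 0
    to the observed follow-up time (in days). The 'now' line gives the
    longest follow-up that could have been observed for this entry date.
    """
    follow_up = (last_seen - entry).days
    max_follow_up = (now - entry).days
    return {"y": entry, "x_end": follow_up, "now_x": max_follow_up}

# Hypothetical patients and cutoff date, invented for illustration.
now = date(1990, 1, 1)
early = eventchart_row(date(1985, 1, 1), date(1990, 1, 1), now)
late = eventchart_row(date(1989, 1, 1), date(1989, 7, 1), now)
print(early, late)
```

The later entrant has the smaller `now_x`, which is why the now line slopes diagonally across the chart; a line that stops short of it marks death or loss to observation.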
Lee et al. (2000) extend Goldman’s concept and display a number of
different ways of representing such data. Using a different data set (patients
with head and neck cancer), they give examples (Fig. 3) of displaying life
course trajectories of patients, but rearrange them (‘‘stack’’ them) according
to (a) the patient ID, (b) their date of registration (with a ‘‘now’’ line), (c)
their total exposure time, and (d) the presence (dashed line) or absence (dotted line) of an important indicator, the p53 protein. The open circles represent the date of registration in the program (start time) and the open
Fig. 2. Eventchart of Bone Marrow Transplantation Data. Lines represent patients’ event
records, with open circles showing time of GvHD, black squares showing time of relapse,
vertical lines showing time of death, and the diagonal line representing the ‘‘now line’’ (the
limit to time available for observation).
Source: Goldman (1992). Reprinted with permission from The American Statistician. Copyright 1992 by the American Statistical Association. All rights reserved.
squares represent the date of last follow-up. Note that, unlike the data in
the previous figure, some people were lost to follow-up before the end of
the observation period—they may have moved or even died (from some
other cause, like a traffic accident). An X represents time of death, a filled
triangle the date of recurrence, and a filled circle the date of second primary
tumor. The software that generates this type of display, ‘‘event.chart’’, is
written in the language S-Plus and archived at http://lib.stat.cmu.edu/S/.
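The four stacking orders amount to sorting the same records by different keys. The sketch below is not the ‘‘event.chart’’ software; the records, field names, and time units are hypothetical and serve only to show the idea.

```python
# Hypothetical patient records; every value is invented for illustration.
patients = [
    {"id": 3, "registered": 120, "last_follow_up": 400, "p53": True},
    {"id": 1, "registered": 300, "last_follow_up": 350, "p53": False},
    {"id": 2, "registered": 10, "last_follow_up": 500, "p53": True},
]

def exposure(p):
    """Total time under observation (same units as the dates, e.g. days)."""
    return p["last_follow_up"] - p["registered"]

# Each ordering stacks the same trajectories differently.
by_id = sorted(patients, key=lambda p: p["id"])
by_registration = sorted(patients, key=lambda p: p["registered"])
by_exposure = sorted(patients, key=exposure)
# Patients with the marker first, then those without; ties broken by ID.
by_marker = sorted(patients, key=lambda p: (not p["p53"], p["id"]))

for name, order in [("id", by_id), ("registration", by_registration),
                    ("exposure", by_exposure), ("p53", by_marker)]:
    print(name, [p["id"] for p in order])
```

Because only the row order changes, the analyst can switch among orderings quickly and see whether a pattern emerges under any of them.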
Another type of life course trajectory is based on the Lexis plot, named
after its creator, the 19th century German demographer Wilhelm Lexis.
Francis and Fuller (1996) developed software (also written in S-Plus) to plot
the life course trajectories of individuals. Examples can be found at the
URL http://www.cas.lancs.ac.uk/alcd/visual/. A figure at this website
depicts ‘‘Lexis pencils’’ showing the employment history of 188 married
couples in Kirkcaldy, Scotland, as a function of the presence of children of
Fig. 3. Event Charts showing different ways of displaying Survival Data.
Source: Lee et al. (2000).
Fig. 4. Four ‘‘Lexis Pencils’’ showing the relationship among Male Employment, Female
Employment, and Age of the Youngest Child.
varying ages. Figure 4 is a black-and-white perspective representation of
one of their color figures, showing only four of the 188 trajectories.5
The trajectories are canted at 45 degrees from the horizontal, and the
base plane is ‘‘anchored’’ according to the woman’s age at marriage. The
(Z) axis coming toward the reader is the year of marriage, the (X) axis
moving to the right is age of the woman, and the vertical (Y) axis is time
since marriage. For example, the trajectory ‘‘closest’’ to the reader is of a
couple who married in 1969 when the woman was about 19 years of age.
Five years later (moving vertically five years) she was about 24 years of age, so
the trajectory has moved diagonally, up five years and to the right five years.
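The axis convention can be checked with a short calculation. This sketch assumes the axes exactly as described in the text; the function name `lexis_point` and the couple used are hypothetical and are not taken from the Kirkcaldy data.

```python
import math

def lexis_point(year_married: int, age_at_marriage: float, years_since: float):
    """Position along one trajectory, using the axes described in the text:
    X = woman's age, Y = time since marriage, Z = year of marriage."""
    x = age_at_marriage + years_since   # age advances with elapsed time
    y = years_since                     # height above the base plane
    z = year_married                    # fixed for a given couple
    return (x, y, z)

# Hypothetical couple (not one of the 188 in the Kirkcaldy data).
start = lexis_point(1960, 22, 0)
later = lexis_point(1960, 22, 5)
dx, dy = later[0] - start[0], later[1] - start[1]
angle = math.degrees(math.atan2(dy, dx))
print(start, later, angle)
```

Since age and time since marriage advance at the same rate, every trajectory rises at 45 degrees in the X-Y plane, and couples differ only in where their trajectories are anchored.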
Each trajectory has three facets, like the three visible faces of a hexagonal pencil, which is why the trajectories are called ‘‘Lexis pencils’’. The
upper facet represents the employment history of the husband and the
5 Since color is essential to understanding this figure, the reader is encouraged to access the
cited website.
middle facet the employment history of the wife: light shading represents
working, dark shading unemployed. The third (lowest) facet represents the
age of the youngest child; the first change in color (shading) occurs upon
that child’s birth. Note that the woman in the closest trajectory stopped
working after about a year of marriage (middle facet), followed soon afterwards by the birth of their first child (bottom facet).
The full figure, not included here (it can be found at the cited URL),
shows how the employment of women has changed over the forty years
represented in the figure, and how more women returned to work (and
returned earlier) after giving birth in the more recent years. That figure can
also be rotated to show different perspectives on the data. Although this
finding is hardly surprising, such age, period, and cohort behavior cannot
be characterized easily using common statistical methods.
These three examples show that it is possible to plot many trajectories
in a single plot when only a few variables are included in/on the trajectories.
Both symbols and line weights can be used to portray variables, depending
on whether they represent events or long-term processes. Trajectories can
be stacked in different ways, which can provide insight into relationships.
In addition, Fig. 4 shows that different domains can be portrayed in a single
trajectory by including different ‘‘faces’’ on the trajectory line.
5.2. Many Variables
Three examples of trajectories with many variables are given. Two are
from the social sciences and one is from the medical literature.
Cohen (1999) shows how a number of variables representing different
domains of an individual’s life can be portrayed in a single figure (Fig.
5). Each panel represents a different individual, and the figure depicts the
(smoothed) extent—0 to 100%—to which each individual has made the transition to adulthood and independence in the depicted domains. The figure
shows how such variables can be portrayed over the life course, permitting
comparisons of the individuals’ development and permitting inferences as
to how the variables interact in different individuals.
Post et al. (1988; see also Leverich and Post, 1993) use graphical techniques to analyze life course data of patients with affective disorder (Fig.
6). The authors point to the benefits of using graphic representation of the
life course of patients, such as more accurate tracking and the ability to
identify which combination of factors (medication, treatment, contextual
factors, etc.) has been effective in dealing with the disorder throughout the
life of a patient.
There are three levels of annotation on the graph. The relevant events
that occurred during the patient’s life course are listed beneath the time line
Fig. 5. Lowess-Smoothed Paths of Four Individuals Making the Transition between Childhood and Adulthood.
Source: Cohen (1999).
Fig. 6. Life Chart of the Course of Illness of a Manic-Depressive Woman.
Source: Post, Roy-Byrne, and Uhde (1988).
(which is segmented to permit it to fit in the journal article); the symptoms
are arrayed on the line, manic episodes above the line and depressive episodes below the line; and the levels and durations of the treatments are given
above the line. Although the figure appears complicated, it has been found
useful in organizing the information about life course events, symptoms that
may have been produced by these events, and treatments that may (or may
not) have ameliorative effects on the symptoms. Thus, the data are
organized in a way that preserves the chronological order of the events,
permitting tentative hypotheses to be generated about events that may have
triggered episodes and the effectiveness of different treatments.
An initial attempt to plot a criminal justice life course trajectory similar
to that in Fig. 6 is shown in Fig. 7 (from Maltz, 1995). This figure is based
on information obtained from one source, narratives written by Cook
County (Illinois) juvenile probation officers in the case jackets of their
charges. It uses rather simple symbols to depict different events in a youth’s
Fig. 7. A single youth’s Juvenile Record, based only on Data from Juvenile Court.
Source: Maltz (1995).
life. In places where symbols were inadequate to describe the events, a text
box was used to convey the necessary information. While this figure is
admittedly crude (it would benefit from better symbols and/or icons, varying line weights, color, and other graphical embellishments), it depicts patterns in a way that would not be possible with the methods and data that
are normally used in studying delinquency. Little detail is shown in Fig. 7,
because it shows data only from a single source of information, Cook
County Juvenile Court records.
6. EXTENSIONS OF THESE METHODS
As Henry (1999) noted, there appears to be a tradeoff in these
trajectory depictions between being extensive (in terms of number of
subjects depicted) and intensive (in terms of number of variables for each
subject). This may appear to be a limitation to the use of graphical
methods; however, we think that one can overcome this limitation to a
great extent. One can ‘‘squeeze down’’ the portrayal of complex individual
trajectories so that many of them can be stacked on the same figure or
computer screen, to permit a search for patterns and commonalities among
the different trajectories. This is the direction we have been moving toward
in this research.
Our initial attempts to develop graphical data displays are presented below.
It should be noted that these are preliminary steps in developing a system
for presenting life course data. One of the strongest differences between our
explorations and those already presented is the number of different event
types we portray, related to different domains of the subject’s life. Our ultimate goal is to squeeze these multiple-trajectory life course pictures down
to portray all information on a single trajectory (perhaps with more than
one facet, as in Fig. 4), so that many trajectories can be depicted and compared on the same chart or computer screen. In this way we hope to be able
to spot patterns in the subjects’ lives, patterns that are undetectable using
standard variable-based techniques.
For example, Klosak (1999) uses different icons, colors, and other
graphical symbols in depicting life course data from field notes obtained
through personal interviews with 15 adult women on probation in Cook
County (Chicago), Illinois. Her analysis involves the development of ways
of portraying life course trajectories of each woman through graphical
presentations of all the relevant domains of their lives.6 Figure 8 is based
on a life course trajectory depicted in color in Klosak (1999). This means
of presenting the data allowed all of the variables to be seen at one time
and highlighted the sets of factors, and their time order, that influence a
woman’s life choices.
Note the many different icons used, with different levels representing
different domains of activity and different colors (in this case, shades)
representing varying impacts of these events (favorable or unfavorable). For
example, educational achievement is represented by a scroll: positive aspects
are represented by white scrolls and negative aspects (e.g., moving from
one school to another in the middle of the school year, dropping out) are
represented by gray scrolls. Good relationships are depicted with normal
white hearts, abusive relationships with inverted gray hearts. The domains
represented in this graph include family, education, employment, relationships, children, alcohol and drugs, and criminal justice involvement.
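The encoding described above (an icon per domain, a shade per valence, a chart level per domain) can be sketched as a small lookup. This is only an illustration under stated assumptions: the text names just the scroll and heart icons, so the `generic` fallback, the top-to-bottom ordering of the domains, the treatment of alcohol and drugs as a single domain, and the names `Glyph` and `encode` are all hypothetical and are not taken from Klosak (1999).

```python
from dataclasses import dataclass

# Domains named in the text, used here as display levels from top to bottom.
# The vertical ordering (and treating alcohol and drugs as one domain) is an
# assumption of this sketch.
DOMAINS = [
    "family", "education", "employment", "relationships",
    "children", "alcohol and drugs", "criminal justice",
]

# Only the scroll and heart icons are described in the text; every other
# domain falls back to a hypothetical "generic" placeholder.
ICONS = {"education": "scroll", "relationships": "heart"}


@dataclass(frozen=True)
class Glyph:
    icon: str       # which symbol to draw
    shade: str      # "white" for favorable, "gray" for unfavorable
    inverted: bool  # True only for unfavorable relationship events
    level: int      # row on the chart, one row per domain


def encode(domain: str, favorable: bool) -> Glyph:
    """Map an event's domain and valence to the glyph that depicts it."""
    if domain not in DOMAINS:
        raise ValueError(f"unknown domain: {domain!r}")
    return Glyph(
        icon=ICONS.get(domain, "generic"),
        shade="white" if favorable else "gray",
        inverted=(domain == "relationships" and not favorable),
        level=DOMAINS.index(domain),
    )
```

Under these assumptions, `encode("education", False)` describes a gray scroll on the education level, and `encode("relationships", False)` describes an inverted gray heart.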
Figure 9 uses the same set of icons as Fig. 8. ‘‘Christy’s’’ present status
appears to be reflective of activities outlined on the drug and family trajectories. She began taking drugs at an early age (seven); both parents used
6 Since the events are based on interviews, it cannot be expected that the interviewees (a) remember and (b) report all events in all relevant domains of their lives. In other words, the extent
of the data portrayed depends on the memory and truthfulness of the respondents, their
understanding of the nature of the inquiry, and their ability to respond appropriately to it.
Fig. 8. Life Course Events of a Female Probationer Code-Named ‘‘Megan’’.
Source: Klosak (1999).
drugs. At various stages of her life, she experienced losses of those close to
her. As indicated by the black down arrows, she has had a lot of stress in
her life: her parents separated when she was an infant, her grandmother
died when she was a young girl, her mother was hospitalized in March 1987,
her brother was sentenced to jail in May 1987, her sister was killed in June
1987, and her mother died in July 1992. Further review of her chart also
shows that since she has been on probation, she has demonstrated signs of
progress. She has obtained her first job and, perhaps most importantly, she
has refrained from further substance abuse.
Analyzing the data in this manner can reveal the factors associated with
the periods when a client is most or least heavily involved in criminal activity. In this way, one can better understand the interrelations of the
different aspects of clients' lives and how these aspects may have affected the
decisions that ultimately led to their current criminal status. Although the
figures are fairly complicated, this is not necessarily a problem. As Tufte
Fig. 9. Life course events of a female probationer code-named ‘‘Christy.’’
Source: Klosak (1999).
(1990: 37) says, ‘‘simplicity of reading derives from the context of detailed
and complex information, properly arranged. A most unconventional design
strategy is revealed: to clarify, add detail’’ (emphasis in the original).
These two life courses can be compared by confining their respective
trajectories to a thin horizontal strip and ‘‘stacking’’ them on the same page.
This is shown in Fig. 10. An additional trajectory is included for comparison, one depicting an individual who experienced what one might consider
‘‘ideal’’ life course events. Code-named ‘‘Barbie’’, this trajectory includes
grade school, middle school, high school, and college; few moves; use of
marijuana during college; a few relationships, the last resulting in marriage
and two children; and employment after college (including a change of
employers) and until the first birth.
Macroscopically, we can contrast the relatively spare trajectory of Barbie with the rather congested trajectories of Megan and Christy. Moreover,
the nature of the events is considerably different, suggesting that an analysis
Fig. 10. Life course events of ‘‘Megan,’’ ‘‘Christy,’’ and ‘‘Barbie,’’ compressed on the same page.
using the same variables for each of the subjects might miss important
phenomena in each person’s life.
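The idea of confining each trajectory to a thin strip and stacking the strips can be illustrated with a plain-text rendering. This is a minimal sketch, not the procedure used to produce Fig. 10: the subjects, ages, and one-letter symbols are invented, and the function names `strip`, `stack`, and `congestion` are hypothetical.

```python
from typing import Dict, List, Tuple

Event = Tuple[float, str]  # (age in years, one-character symbol)


def strip(events: List[Event], start_age: float = 0, end_age: float = 40,
          width: int = 41) -> str:
    """Compress one life into a single line, one column per slice of age."""
    cells = ["-"] * width
    scale = (width - 1) / (end_age - start_age)
    for age, symbol in events:
        if start_age <= age <= end_age:
            col = round((age - start_age) * scale)
            # Two events landing in the same column are marked "*" so that
            # crowding stays visible instead of one symbol hiding another.
            cells[col] = symbol if cells[col] == "-" else "*"
    return "".join(cells)


def stack(lives: Dict[str, List[Event]], **kwargs) -> str:
    """Place the strips for several subjects one above another."""
    pad = max(len(name) for name in lives)
    return "\n".join(f"{name:<{pad}} {strip(ev, **kwargs)}"
                     for name, ev in lives.items())


def congestion(events: List[Event], **kwargs) -> int:
    """Count occupied columns: a rough measure of how crowded a strip is."""
    return sum(cell != "-" for cell in strip(events, **kwargs))


# Invented subjects and events, for illustration only.
# E = education, D = drug use, S = stressful event, J = justice contact.
lives = {
    "Subject A": [(6, "E"), (7, "D"), (10, "S"), (17, "S"), (17, "J"), (25, "J")],
    "Subject B": [(6, "E"), (18, "E"), (22, "E")],
}
print(stack(lives))
```

Marking collisions explicitly is one possible response to the overlap problem noted for the Excel figures; a count such as `congestion` gives only a crude numerical counterpart to the visual contrast between spare and congested trajectories.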
We recognize that these figures are inadequate; they should be taken
more as representative of what might be done than as a finished product.
First, the figures are in black and white, since this journal does not print in
color. Second, the figures were created by a spreadsheet program (Excel)
that has limited graphics capability and a limited set of ready-made symbols—some of the symbols were drawn by the authors. Third, the symbols
that represent different variables are not intuitive, and there is no control
over which ones are on top when they overlap. Fourth, long-term processes
(such as parenting or education), which might be represented by lines or
horizontal bars whose thickness and/or color change over time, could not
be represented in this way in Excel.
Our expectation is that a mature graphical analysis tool, one more fully
developed than this, would produce a number of benefits.
• After the initial interview, the chart could be shown to interviewees
to give them an idea of how their own lives have been shaped, and
might elicit more information from them about other salient events
and processes in their lives.
• By putting all (or much) of the information in a readily accessible
form, it would also help the analyst in understanding all of the many
events that shape a person’s life.
• In encapsulating events from so many domains of a person’s life, it
may assist in ferreting out hidden patterns, for example, whether the
nature of stressful events is more important than their intensity, their
number or their timing.
• It could be of benefit in developing new categories of individuals, in
terms of behavioral responses that are not strictly linear and amenable to existing algorithms, and may also help in the development of
new algorithms to study such behaviors. In fact, a visual display of
data ‘‘offers us a way to do perceptual clustering’’ (Wilkinson,
1999: 111).
In other words, we believe that the figures do show the promise
inherent in graphical representation of life courses. They also suggest the
directions that must be taken to develop an analytic tool that can be used
to understand life course trajectories.
7. REQUIREMENTS FOR A GRAPHICAL TOOL FOR LIFE COURSE ANALYSIS
This brief review of the pros and cons of the various means of depicting
trajectories suggests the following desiderata for a new set of graphical
analysis tools:
• The data frame used to collect and store the data should have a
separate page or table for each individual. Each page/table should
have separate rows for each event or process, with as many columns
as needed to describe that individual’s events. The events themselves
may have a number of descriptors associated with them (e.g.,
location, color, size, etc.) that can be detailed in different columns.
• There should be a fairly large repertoire of symbols to represent different phenomena. These symbols should be readily interpretable;
i.e., they should bring to mind the type of events they represent.
Standardization makes interpretation easier; for example, in
regression analyses we interpret the $x_i$ as independent variables and
the $\beta_i$ as coefficients, and would be confused were we to read a paper
that reversed their meaning. The tools should be sufficiently versatile
to allow an investigator to create new symbols as needed and incorporate them into the analysis.
• Since the events represented by the symbols may have different
meanings for different subjects, they should be distinguished by color
and size as to their importance and valence, i.e., whether they have
a positive or negative effect. In fact, graphing permits one to rethink
causal relationships and determine the effect of different events
retrospectively.
• Not only should events be depicted, but processes as well. For
example, educational attainment is not just a static outcome but a
process that may be affected by family dynamics, residential
mobility, peer group involvement, or other events in an individual’s
life. To depict such processes, the lines themselves should take on
different colors, thicknesses, and textures, and it should be possible
to vary these characteristics over the life course. That is, if an individual begins to fall behind in school, the line representing educational
attainment may change color or thickness.
• Different domains of a person’s life should be represented on the
same graph. Rather than doing so by separating the domains vertically (as in Fig. 8), they should be packed together (as in the ‘‘Lexis
pencils’’ of Fig. 4), in order to permit the comparison of many trajectories simultaneously.
• It should be possible to stack different trajectories in the same graph,
and use different criteria to determine the order of the stacking (as
in Fig. 3).
• It should be possible to change the origin of the horizontal axis, say,
from calendar time to time since birth or time since some other event
(marriage, as with Francis and Fuller, 1996, or birthdate, or release
from custody, etc.).
• Since some events are fairly complicated, it should be possible to
describe their characteristics by attaching a text box to the symbols
representing them, that can be opened by clicking on them (or by
passing the cursor over them). This, therefore, implies that the
methodology should be dynamic, i.e., should be interactive. That is,
not only should we go beyond numbers, tables, and text to present
(and represent) data, but we should go beyond paper as well.
• It should be possible to magnify or ‘‘explode’’ the trajectories either
horizontally (to inspect a sequence of closely-packed events) or vertically (to inspect the different life course domains individually). This
also implies an interactive user interface.
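Several of these desiderata (a separate table per individual, events and processes with their own descriptors, a movable time origin, and stacking by different criteria) can be sketched as a small data structure. This is one possible reading of the requirements, not an existing tool: the class and function names (`Event`, `Life`, `rebase`, `stack_order`), the field choices, and the two subjects with their dates are all hypothetical and invented for illustration.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Any, Callable, List, Optional, Tuple


@dataclass
class Event:
    """One row in an individual's table: a point event, or a process if `end` is set."""
    domain: str
    label: str
    start: date
    end: Optional[date] = None
    valence: int = 0      # -1 unfavorable, 0 neutral, +1 favorable (drives color)
    importance: int = 1   # drives symbol size
    note: str = ""        # free text for a pop-up text box


@dataclass
class Life:
    """One individual's table; each subject can hold entirely different events."""
    subject: str
    birth: date
    events: List[Event] = field(default_factory=list)

    def first(self, label: str) -> Optional[date]:
        """Date of the earliest event with this label, or None if absent."""
        dates = [e.start for e in self.events if e.label == label]
        return min(dates) if dates else None


def rebase(life: Life, origin: Optional[date] = None) -> List[Tuple[float, Event]]:
    """Re-express event times as years since `origin` (birth by default)."""
    origin = origin or life.birth
    return [((e.start - origin).days / 365.25, e) for e in life.events]


def stack_order(lives: List[Life], key: Callable[[Life], Any]) -> List[str]:
    """Order subjects for stacking by any criterion the analyst supplies."""
    return [life.subject for life in sorted(lives, key=key)]


# Invented subjects and dates, for illustration only.
a = Life("A", date(1970, 1, 1), [
    Event("education", "school", date(1976, 9, 1), date(1988, 6, 1), valence=1),
    Event("criminal justice", "probation", date(1995, 3, 1), valence=-1),
])
b = Life("B", date(1975, 6, 15), [
    Event("criminal justice", "probation", date(1993, 6, 15), valence=-1),
])


def age_at_probation(life: Life) -> float:
    """Age in years at the first probation event (assumes one exists)."""
    return (life.first("probation") - life.birth).days / 365.25


print(stack_order([a, b], key=lambda life: life.birth))
print(stack_order([a, b], key=age_at_probation))
```

Keeping each individual's events in a separate list, rather than forcing all subjects into shared columns, is the point of the first requirement; `rebase` and `stack_order` then correspond to changing the origin of the horizontal axis and choosing the stacking criterion. Interactive features such as text boxes and magnification are outside the scope of this sketch.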
Although it was not known at the time this paper was originally written, a product that includes virtually all of these features is now available.
Wilkinson (1999) and his colleagues have developed a software package,
Graphics Production Language, in Java that can be used to depict processes
and events on individual trajectories. We have not employed it in depicting
and analyzing trajectories, but feel that it may prove to be a very useful
adjunct to the person-based, idiographic analytic approach described in this
paper.
8. CONCLUSION
The goal of statistical analysis is to find patterns in data. Most statistical methods rely on analyzing the effect of the same set of variables on the
population under study, i.e., nomothetic analysis. However, not all individuals in the study are affected by the same set of variables: each individual
may have his/her own individual set of relevant variables, suggesting that
methods be developed that consider them individually, i.e., idiographic
analysis. Each set of methods has its advantage: nomothetic methods generate general laws that apply to all, while idiographic methods trace the putative causal relationships that are unique to each individual. Our goal has
been to explore how (or whether) the best features of these two different
ways of looking at data can be combined.
This paper provides a rationale for and some desired characteristics of
a new methodology for analyzing data in non-traditional ways. In particular, the new methodology focuses on suggested requirements for analyzing
life course data, in an attempt to combine nomothetic and idiographic
methods. It may be argued that some lives are so complicated that the
amount of detail would swamp anyone’s attempt to depict those trajectories,
and that some measure of selective editing needs to be done, but it is premature to make such a judgment. As a counterexample, consider the
amount of information conveyed by a modern map, which shows, in different
colors and overlays, topography, the location and approximate size of cities, jurisdictional boundaries, altitude, points of interest, land use, and other
useful information.
Old coding procedures seem to squeeze the juice out of personal histories in an effort to collect comparable data. One has the feeling, even after
reading some prize-winning analyses of life courses, that the researchers
wished that there were other, more appropriate ways of handling individual
life histories and finding patterns, but they were constrained by a set of
algorithms that did not fit the type of data they had. They did the best they
could, but in the end had to distill their data to fit into the strait-jacket of
the rectangular data frame.
It may be said that the methods outlined herein are somewhat crude
and limited. It was suggested that using graphical techniques to analyze and
compare more than 20 trajectories may not be possible. This may in fact be
the case; however, once these techniques are in use, it may well be that
newer methods of analyzing the data are developed, in much the same way
that LISREL and HLM and other statistical methods have been added to
the repertoire of the social scientist.
This is more than just a new way of collecting data; one has to reconceptualize from collecting ‘‘data’’ to collecting stories. It also means that we
have to unlearn some of the strictly ingrained (variable-based) habits of data
analysis, and to permit consideration of different ways of finding patterns in
data. In sum, we feel that this exploration into graphical analysis of life
course data is an approach well worth the effort. We have just begun to
scratch the surface in moving toward the realization of Wild’s (1994, p. 168)
prediction, that in the future ‘‘the primary language for promoting human
understanding of data will be sophisticated computer graphics rather than
mathematics.’’
ACKNOWLEDGMENTS
The research for this paper was supported by Grant 95-BJ-CX-0001
from the Bureau of Justice Statistics, U.S. Department of Justice (Visiting
Fellowship, ‘‘Development of Graphical and Geographical Methods of
Analyzing Data’’), as well as by a sabbatical from the University of Illinois
at Chicago, to the first author. The opinions expressed herein are those of
the authors and do not necessarily represent the official position or policies
of the U.S. Department of Justice. A version of this paper was presented at
the Life History Research Society Conference, September 25, 1999, Kauai,
Hawaii. The authors thank John Laub, Mindie Lazarus-Black, Joan
McCord, Daniel Nagin, John Richters, Robert Sampson, and Leland Wilkinson for their comments on an earlier draft.
REFERENCES
Briggs, C. L. (1986). Learning How to Ask: A Sociolinguistic Appraisal of the Role of the
Interview in Social Science Research, Cambridge University Press, Cambridge, England.
Cleveland, W. S. (1993). Visualizing Data, Hobart Press, Summit, NJ.
Cleveland, W. S. (1994). The Elements of Graphing Data, Hobart Press, Summit, NJ.
Cohen, P. (1999). Presentation materials prepared for the Life History Research Society Conference, Kauai, Hawaii, September 22–25, 1999.
Farrington, D. P. (1993). Interactions between individual and contextual factors in the development of offending. In Rutter, M. D. (ed.), Studies in Psychosocial Risk: The Power of
Longitudinal Data, Cambridge University Press, Cambridge, England.
Fisher, R. A. (1935). The Design of Experiments, Oliver and Boyd, Edinburgh, Scotland.
Francis, B., and Fuller, M. (1996). Visualization of event histories. J. R. Statist. Soc. A
159(2): 301–308.
Geertz, C. (1973). The Interpretation of Cultures, Basic Books, Inc., New York, NY.
Goldman, A. I. (1992). Eventcharts: Visualizing survival and other timed-events data. Am.
Statist., 46(1): 13–18.
Gottfredson, M. R., and Hirschi, T. (1990). A General Theory of Crime, Stanford University
Press, Stanford, CA.
Henry, D. (1999). Presentation at the Methods Workshop, Life History Research Society Conference, Kauai, Hawaii, September 23, 1999.
Kagan, J. (1997). Conceptualizing psychopathology: The importance of developmental profiles.
Dev. Psychopath. 9: 321–334.
Kagan, J. (1998). Three Seductive Ideas. Harvard University Press, Cambridge, MA.
Klosak (Mullany), J. (1999). The Course of their Lives: Women Offenders on Probation. Unpublished Ph.D. dissertation in Public Policy Analysis, University of Illinois at Chicago.
Lee, J. J., Hess, K. R., and Dubin, J. A. (2000). Extensions and applications of event charts.
Am. Statist. 54: 63–70.
Leverich, G. S., and Post, R. M. (1993). The NIMH Life Chart Manual for Recurrent Affective
Disease: the LCM. Biological Psychiatry Branch Monograph, National Institute of
Mental Health.
Loftus, G. R. (1993). A picture is worth a thousand p-values: On the irrelevance of hypothesis
testing in the microcomputer age. Behav. Res. Methods, Instrum., and Computers 25(2):
250–256.
Magnusson, D., and Bergman, L. R. (1990). A pattern approach to the study of pathways
from childhood to adulthood. In Robins, L. N., and Rutter, M. (eds.), Straight and Devious Pathways from Childhood to Adulthood, Cambridge University Press, Cambridge,
England.
Maltz, M. D. (1994). Deviating from the mean: The declining significance of significance. J.
Res. Crime and Delinq. 31(4): 434–463.
Maltz, M. D. (1995). Criminality in space and time: Life course analysis and the micro-ecology
of crime. In Eck, J., and Weisburd, D. (eds.), Crime and Place, Criminal Justice Press,
Monsey, New York.
Maltz, M. D. (1998). Visualizing homicide: A research note. J. Quant. Criminol. 14(4):
397–410.
Maltz, M. D., and Zawitz, M. W. (1998). Displaying Violent Crime Trends Using Estimates
from the National Crime Victimization Survey, Bureau of Justice Statistics Technical
Report NCJ 167881, June 1998.
Maxfield, M. G., and Babbie, E. (1998). Research Methods in Criminal Justice and Criminology,
Second Edition, Wadsworth, Belmont, CA.
McCord, J. (1990). Long-term perspectives on parental absence. In Robins, L. N., and Rutter,
M. (eds.), Straight and Devious Pathways from Childhood to Adulthood, Cambridge University Press, Cambridge, England.
McCord, J. (1993). Descriptions and predictions: Three problems for the future of criminological research. J. Res. in Crime and Delinq. 30: 412–425.
Meehl, P. E. (1954). Clinical Versus Statistical Prediction: A Theoretical Analysis and a Review
of the Evidence, University of Minnesota Press, Minneapolis, MN.
Meehl, P. E. (1973). Psychodiagnosis: Selected Papers, University of Minnesota Press, Minneapolis, MN.
Moffitt, T. E. (1993). Adolescence-limited and life-course-persistent antisocial behavior: A
developmental taxonomy. Psycholog. Rev. 100(4): 674–701.
Nagin, D. S. (1999). Analyzing developmental trajectories: A semiparametric, group-based
approach. Psycholog. Meth. 4(2): 139–157.
Nagin, D. S., and Paternoster, R. (1991). On the relationship of past and future participation
in delinquency. Criminology 29: 163–190.
Post, R. M., Roy-Byrne, P., and Uhde, T. W. (1988). Graphic representation of the life course
of illness in patients with affective disorder. Am. J. Psychiat. 145(7): 844–848.
Richters, J. E. (1997). The Hubble hypothesis and the developmentalist’s dilemma. Devel. and
Psychopath. 9: 193–229.
Sampson, R. J., and Laub, J. H. (1993). Crime in the Making: Pathways and Turning Points
through Life, Harvard University Press, Cambridge, MA.
Tufte, E. R. (1983). The Visual Display of Quantitative Information, Graphics Press, Cheshire,
CT.
Tufte, E. R. (1990). Envisioning Information, Graphics Press, Cheshire, CT.
Tufte, E. R. (1997). Visual Explanations, Graphics Press, Cheshire, CT.
Tukey, J. W. (1977). Exploratory Data Analysis, Addison-Wesley, Reading, MA.
Wainer, H. (1997). Visual Revelations: Graphical Tales of Fate and Deception from Napoleon
Bonaparte to Ross Perot, Copernicus (Springer-Verlag), New York, NY.
Wild, C. J. (1994). Embracing the ‘‘wider view’’ of statistics. Am. Statist. 48: 163–171.
Wilkinson, L. (1999). The Grammar of Graphics, Springer-Verlag, New York, NY.