Grant Awards
- PI: Mark S. Handcock
(PI - Michael Rendall)
- Funding Agent: NIH (Subcontract from Rand Corp.)
- Amount: $64,144
- Date: August 1, 2007 - July 31, 2009
- Title: "Immigration, Emigration, and Age-by-Country Structure of Mexican
Cohort Lifetimes"
- Abstract:
This research is in part motivated by the very large discrepancy between
the assumptions of the demographic models addressing the question of the
effect of immigration on population aging and the empirical evidence about
the migration processes of the US's single largest immigrant-contributing
country, Mexico. The nature of the demographic models is that they assume
that immigrants settle in the receiving country. The empirical evidence
with regard to Mexico is that large numbers of immigrants do not settle in
the US, and instead return to Mexico. The study aims to understand the
consequences of substantial levels of return migration first on the
likelihood of aging in the US among Mexican-born male and female
immigrants to the US, and second on the selectivity of those who stay in
the US into old age. US and Mexican 1990 and 2000 census microdata are
used together first to estimate a two-region cohort migration model. The
extent to which later migrant streams from Mexico contain more people who
settle in the US and are increasingly balanced by gender is assessed by
comparing late 1980s and late 1990s migration estimates.
The study is a first step towards developing a better understanding of the
future impact of Mexican immigration on the Mexican-born population's age
structure and composition by education, family, and health
characteristics. It is also a first step towards a broader understanding
of the impact of immigration in general on the age structure of the US
population and thus on the US's ability to support an older population
with greater health needs.
- PI: Adrian E. Raftery
(PI - Susan Joslyn)
- Funding Agent: NSF (Sub-budget portion)
- Amount: $281,053
- Date: October 1, 2007 - September 30, 2010
- Title: "DRU – Weather Forecast Uncertainty"
- Abstract:
Information about weather forecast uncertainty, which has been available for some
time, is rarely communicated in public forecasts, although it is theoretically
beneficial to weather related decisions with important economic and safety
consequences. One concern is the difficulty the general public might have in
understanding such information. To date, however, very little research has
investigated the psychological processes involved in understanding and using weather
forecast uncertainty in realistic contexts among non-expert users. In order to
determine how best to communicate forecast uncertainty to the general public, their
needs and information processing requirements must first be understood. This
project will conduct both naturalistic and experimental research to accomplish
these goals. Then we will develop new probabilistic forecasting methods for
extreme events and warnings, and new methods for verifying their performance.
Finally we will design and create uncertainty products that are compatible with
identified user needs and cognitive requirements. These products will be based on
output from the University of Washington regional ensemble system. Some of these
products, which provide weather warnings for extreme events, will require the
development of innovative probabilistic forecasting methods. Weather warnings for
extreme events have important safety implications but there has been little
attention, from either the psychological or statistical research communities,
given to a probabilistic approach to these issues. Targeted research such as this,
using state-of-the-art uncertainty products among non-expert end users, is virtually
unique and will provide important foundational research for the study of
communicating forecast uncertainty.
- PI: Elena Erosheva
- Funding Agent: NIH
- Amount: $121,280
- Date: September 1, 2007 - August 31, 2009
- Title: "Operational Definition of Chronic Disability in the National
Long-Term Care Survey."
- Abstract:
This application requests two years of funding to investigate the
operational definition of chronic disability in the National Long-Term Care
Survey (NLTCS). Published studies that use data or refer to results from the
NLTCS vary in the amount of detail they provide on the definition of chronic
disability employed by the survey. Most of these studies, however,
oversimplify the NLTCS's operational definition of chronic disability by
ignoring longitudinal features of the survey. This practice may lead not
only to erroneous conclusions but also to misspecified policy implications.
The NLTCS began in 1982 and now extends over six waves through 2004. It
provides an important source of information on possible changes in
disability over time among elderly Americans. The NLTCS data on basic
and instrumental activities of daily living have been used to generate some
major findings, such as a decline in chronic disability among
elderly Americans. However, the complexity of the design, influenced by many
decisions made in the early years of the survey, presents conceptual and
analytic challenges for secondary users of the NLTCS data. In particular,
the operational definition of chronic disability employed by the NLTCS is
difficult to track down comprehensively. As a result, it often gets
misinterpreted toward an oversimplification. Our preliminary study shows
that the NLTCS by design measures some combination of chronic and short-term
disability as opposed to chronic disability as commonly stated in the
literature. The primary aims of this project are to develop a comprehensive
description of the operational definition of chronic disability used in the
NLTCS and to investigate the impact of the design choices made by the NLTCS
on the measurement of chronic disability. This project will illuminate the
interplay between the basic definition of chronic disability, as a
disability lasting more than 90 days, and the complex longitudinal design of
the NLTCS. It will also investigate whether there are subgroups of the
elderly population that are differentially affected by the NLTCS design
choices as they relate to the measurement of chronic disability. Finally, it
will explore whether additional data from the NLTCS can be used to obtain a
valid chronic disability measure. Findings of this study will benefit
secondary users of the NLTCS data and future designers of longitudinal
surveys that aim to track chronic disability status of the elderly over
time.
- PI: Adrian E. Raftery
- Funding Agent: NIH
- Amount: $1,310,400
- Date: August 15, 2007 - May 31, 2011
- Title: "Assessing Uncertainty in
Population Projection Models via Bayesian Melding."
- Abstract:
The goal of our proposal is to develop a statistical framework
for probabilistic population projections and for assessing
uncertainty in linked demographic-disease models. The most common
approach to communicating uncertainty in population projections is
the scenario, or High-Medium-Low, approach, which has no
probabilistic basis and leads to inconsistencies. We propose
Bayesian melding as an alternative that can take account of all the
available evidence and uncertainties about inputs and outputs from
population projection models, to yield a predictive distribution of
any quantity of policy interest. Uncertainty is even more important
for linked demographic-disease models, when the goal is to forecast
future population and disease prevalence in the presence of an
epidemic. The United Nations Population Division has decided to
assess Bayesian melding as a method for assessing uncertainty in its
population projections. UNAIDS has decided to use Bayesian melding
as the basis for assessing uncertainty in their demographic and
prevalence projections. The specific aims of the research will be:
(1) Methodological development of Bayesian melding to assess
probabilistic forecasts, to deal with measurement and systematic
errors, to provide a framework for model improvement, model selection
and model uncertainty, and to develop more computationally efficient
methods. (2) Develop Bayesian melding methods for probabilistic
population projections, including fertility, mortality and
migration. (3) Develop Bayesian melding methods for linked
demographic-disease models, including the incorporation of multiple
data sources, and the assessment of behavior change. (4) Produce and
distribute software implementing the new methods produced by our
research.
- PI: Peter Hoff
- Funding Agent: NSF
- Amount: $400,000
- Date: November 15, 2006 - October 31, 2009
- Title: "Longitudinal Network Modeling of International Relations Data."
- Abstract:
Empirical analyses of international relations data have become one of the
principal methods by which researchers evaluate theories of trade, conflict
and other interactions between countries. For example, regression modeling
has recently been used as a method of evaluating the question of whether or
not the community of democratic countries is inherently peaceful. The data
used in these analyses are inherently longitudinal, involving measured
relations between nations over time. Despite this fact, and that many of the
core approaches in scientifically oriented studies of international politics
spring from strong policy concerns, very seldom do these statistical
modeling efforts account for the temporal nature of the data, or attempt to
gauge the validity of the obtained model-fitting results by comparison to
unfolding events.
To address these issues, we will develop and implement statistical models
for relational data that take into account (a) the complex dependencies
inherent to relational, social network data, and (b) the evolution of
international relations over time. This will be done by extending regression
and latent factor models for network data to the time domain, allowing for
the analysis of complicated longitudinal relational data using tools that
are familiar to social science researchers. We will leverage the
longitudinal nature of the data to evaluate candidate statistical modeling
approaches and estimation methods, and will use the methodology to better
understand the dynamics of international conflict and trade data.
- PI: Sibel Sirakaya
- Funding Agent: UW Royalty Research Fund (RRF)
- Amount: $23,454
- Date: September 16, 2006 - September 15, 2007
- Title: "Sovereign Lending Under Limited Enforcement."
- Abstract:
This paper will develop a model of a two-sector small open economy
with limited enforcement of foreign debt. I will study an incentive
constrained self-enforcing lending scheme in which a debtor's repayment
utility never falls below its default option. A general purpose numerical
method will be developed to carry out the simulations of the model under a
wide range of parameter sets to demonstrate the extent of inefficiencies due
to limited enforcement.
- PI: Adrian E. Raftery
(PI - Alan Borning)
- Funding Agent: NSF (Sub-budget portion)
- Amount: $86,395
- Date: January 1, 2006 - December 31, 2008
- Title: "Modeling Uncertainty in Land Use and Transportation Policy Impacts:
Statistical Methods, Computational Algorithms, and Stakeholder Interaction."
- Abstract:
In computational statistics, we are developing, analyzing, and validating
techniques for representing and propagating uncertainty through a
sophisticated modeling system. Our approach uses promising but preliminary
results in Bayesian melding. We propose to develop new statistical methods
adapted to the challenges posed by UrbanSim (a sophisticated system to model
urban development), which include model stochasticity, large effects of
measurement and systematic errors, high dimension of model inputs and
outputs, and significant running time for the underlying model. In addition
to the statistical challenges, however, undertaking this approach makes
extreme computational demands; and achieving acceptable performance will
require algorithmic advances, as well as sound software engineering. In
human-computer interaction, the research challenges include supporting
meaningful stakeholder access to and interaction with complex situations,
including representations of uncertainty. Finally, in the emerging area of
science and design, an important question is: how can we design and
evaluate the system overall, in a principled way, to support such basic
values as accurate presentation of results (including their limitations and
uncertainties) and transparency? If we succeed in this work, UrbanSim has
the potential to significantly aid in public deliberation over major
decisions regarding urban sprawl, economic health, sustainability, and other
issues. Our system is Open Source and freely available, and has already
attracted considerable interest and use. Further, the results in
computational statistics should be applicable to a broad range of
simulations of economic or environmental processes to inform public policy
development and deliberation. Finally, the interaction techniques and
findings should be applicable to a range of other stakeholder interactions
with complex models and sources of information.
- PI: Ross Matsueda
- Funding Agent: NIH
- Amount: $993,535
- Date: September 1, 2005 - August 31, 2008
- Title: "Life Course Trajectories of Substance Use
and Crime"
- Abstract:
This proposal estimates trajectories of substance use and crime through the
life course, and builds models to explain those trajectories. It uses three
datasets, the Denver Youth Survey, National Youth Survey, and Add Health
Survey. It identifies key risk factors from social learning (coercive
parenting, delinquent peers, delinquent attitudes, delinquent identity),
rational choice (risk of arrest, rewards of drug use), stable trait
(impulsivity), and life course theories (high school graduation, employment,
marriage). The analysis begins by estimating individual growth curves of
marijuana, cigarettes, alcohol, other drugs, and delinquency. It then uses
multi-level models to test four hypotheses: (1) A comorbidity hypothesis in
which a latent variable underlies one or more trajectories. (2) A stable
context and stable trait hypothesis, in which trajectory parameters are
predicted by stable traits like impulsivity, and stable contexts, like SES
and family functioning. (3) A life course hypothesis, in which life course
transitions are treated as time-varying covariates predicting substance use
trajectories. (4) A social process hypothesis, in which process variables,
like delinquent peers or perceived risk of arrest, influence trajectories. We
then examine latent classes of trajectories using Nagin's nonparametric
mixed model. We test Moffitt's hypothesis that at least two groups--life
course persistent and adolescence limited--underlie trajectories of illegal
behavior, and extend the hypothesis to substance use. We revisit the
comorbidity hypothesis by examining whether comorbidity varies within and
across latent classes. We then test whether contextual variables and stable
traits can explain the group classifications, and using twin data, estimate
genetic effects. Finally, we will test our process and life course theories
by testing whether their effects are moderated by latent classes (e.g., Are
life course persistent drug users immune to the threat of arrest? Do
adolescence limited users learn from life course persistents?). Such results have
important health policy implications for prevention and education.
- PI: Peter Hoff
- Funding Agent: NSF
- Amount: $150,000
- Date: October 1, 2004 - September 30, 2006
- Title: "Network Modeling of
International Peace and Trade Data"
- Abstract:
Despite the desire to focus on the interconnected nature of politics
and economics at the global scale, most empirical studies assume not
only that the major actors are sovereign countries, but also that their
relationships are independent. This means, for example, that trade is
often studied without taking into account the interdependence of one
country's trade with another. Similarly in international politics, it
is often assumed that the policies of one country are entirely
independent of the policies in another, even though we may observe
consultation between them. Statistical studies have typically assumed
that these kinds of dependencies must be ignored. In contrast we
employ newly developed statistical methods to reveal these heretofore
hidden interdependencies among both trade and international
politics. In particular, we develop and estimate statistical models
for dependent dyadic data that simultaneously estimate the correlation
of actions having the same initiator, the correlation of actions
having the same recipient, as well as the reciprocity of actions
between a pair of actors and third-order dependencies involving the
clustering of three or more actors. In particular, we re-examine some
of the claims of the democratic peace hypothesis to see whether they
may be explained in part by the dependencies among the actions of
countries. In addition, we also re-examine standard models of
international trade to gauge whether international commerce can be
better understood in the context of dependencies among trading
patterns. Preliminary results suggest that there is considerable
leverage to be gained by focusing on the dependencies in dyadic data
of the kind represented by international trade as well as
international conflict and cooperation.
The application of this approach has the promise of transforming
empirical studies of dyadic, or transactional, data in political
science, geography, and economics. In so doing, it may help to
re-energize examination of the impacts of international dependencies
upon international cooperation and commerce. Understanding the second
and third order dependencies among trading countries will help provide
a clearer picture of the global opportunities and barriers to
increased levels of global trade.
- PI: James Kitts
- Funding Agent: NSF
- Amount: $146,223
- Date: October 1, 2004 - March 31, 2007
- Title: "Creating Dynamic Social Network
Models from Sensor Data"
- Abstract:
In most situations our decision making is influenced by the actions of
others around us. Informal networks of collaboration coexist
within the formal structure of the institution and can enhance the
productivity of the organization. The physical structure of an
institution can hinder or encourage communication. The dynamics of
communication influences the diffusion of information. Existing
techniques for capturing the relationship between actions and various
environmental and organizational attributes rely heavily on tedious
manual techniques or situation based approaches. We propose a
data-driven approach, where we build a computational framework for
learning the structure and dynamics of social networks automatically
from low-level sensor data.
This effort involves: (i) Collecting data on human activity and
interpersonal interactions using a number of complementary
technologies, including machine vision, audio analysis, and wearable
GPS (global positioning system) units; (ii) Developing probabilistic
reasoning algorithms that can robustly infer patterns of interaction
even in the face of noisy and incomplete data; and (iii) Modeling and
analyzing these patterns of interaction using probabilistic graphical
models to gain insight into the structure and dynamics of human
communities.
Because our technology allows us to pinpoint the time and place of
individual interactions, our approach allows us to create dynamic
models that reveal the evolution of social networks over time. We
will be able to explore how, for example, communities are reshaped by
stimuli such as the gain or loss of members or changing work
assignments.
- PI: Katherine Stovel
- Funding Agent: NSF
- Amount: $148,527
- Date: April 1, 2004 - March 31, 2006
- Title: "About a Job: Networks,
Information and Segregation in Labor Markets"
- Abstract:
In this project we study how matching processes and structural
conditions interact to produce various levels of segregation in labor
markets. Empirical evidence reveals that labor markets are often
highly segregated with respect to the ascribed attributes of workers.
Most of the traditional explanations that have been proposed to
account for segregation in labor markets can be classified as either
'supply-side' (worker qualifications or preferences) or 'demand side'
(job requirements or discrimination by employers) accounts. Neither
of these accounts addresses the structure of information that links
potential workers and employers, or how these actors evaluate the
information they do acquire. However, how potential workers hear
about vacant jobs, and how employers view referred employees, are
crucial parts of the hiring process, and have implications for the
level of segregation in a labor market.
Our project has three specific aims: (1) To refine and extend our
existing two-sided matching model of a labor market to incorporate key
aspects of labor market institutions and the information structures
(including networks) that are relevant for recruiting; (2) to
calibrate this model with data describing empirical labor markets; (3)
to use this model as an experimental framework to generate testable
hypotheses about the relative importance of supply-side, demand-side,
and matching based mechanisms that can influence the level of
segregation in a labor market.
- PI: Elena Erosheva
- Funding Agent: NIH, Subcontract with Harvard Medical School
- Amount: $8,784
- Date: November 1, 2003 - May 31, 2004
- Title: "Epidemiology: National
Comorbidity Survey Replication"
- Abstract:
This project proposes to study patterns of co-morbidity with the Grade
of Membership (GoM) model, a statistical model for discrete data
analysis. The data set contains dichotomous responses which provide
presence or absence of 16 mental disorders from about 10,000
individuals in the U.S., and 250,000 individuals around the world.
Assuming existence of extreme (basis) categories in the data, the GoM
model postulates that individuals can have mixed membership in the
extreme categories. The GoM model was developed in the 1970s and has
been applied to a wide spectrum of health-related studies. This
project proposes to use recently developed Bayesian estimation methods
for analysis of co-morbidity patterns via the GoM model. Specifically,
the goals of the proposal include: (1) to explore whether there is
evidence of extreme categories in the data and whether patterns of
co-morbidity are likely to exhibit mixed membership structure; (2) to
estimate the GoM model parameters using a Bayesian framework; and (3) to
determine whether the GoM model provides a reasonable fit for the
co-morbidity data.
- PI: Elena Erosheva
- Funding Agent: NIH, Subcontract with Carnegie Mellon University
- Amount: $192,000
- Date: September 30, 2003 - August 31, 2006
- Title: "Modeling Longitudinal
Disability Survey Data"
- Abstract:
Survey data on disability among the elderly are available from several
sources, most prominently the National Long Term Care Survey
(NLTCS). The NLTCS began in 1982 and now extends over five waves
through 1999, making it a rich source of information on possible
changes in disability over time. But these data pose challenges for
both statistical modeling and the protection of confidentiality of the
information provided by survey respondents, especially when the data
for individuals are linked across waves. Most statistical approaches
used to analyze NLTCS data are based on disability scales that cannot
account for the complexity of disability manifestations. Attempts to
deal with such complexity include traditional multivariate methods for
both discrete and continuous data, and approaches based on the grade
of membership model. These methods typically require either making
heroic simplifying assumptions or need to be adapted. This project
aims to develop new statistical models and approaches for the analysis
of such survey data. It also proposes to take a fresh look at the
risk of inadvertent disclosure of information on NLTCS respondents and
to develop new approaches to protect against disclosure while
preserving access to the maximal amount of information in the data
required for their proper analysis using the new models and
methods.
- PI: Mark S. Handcock
- Funding Agent: NIH/NICHD
- Amount: $1,095,133
- Date: February 1, 2003 - January 31, 2007
- Project Website: http://www.csss.washington.edu/Research/combining
- Title: "Combining Survey and Population
Data on Births and Family"
- Abstract:
This study's overall objective is to develop statistical methods for
combining surveys and population data collections (especially of
births and marital and non-marital unions) for the improved estimation
of these birth and childhood circumstances. The family and
socio-economic circumstances of children's parents at birth and during
the childrearing years are fundamental determinants of children's
health and well-being. Specific aims are to (1) Develop and test
statistical methods to combine multiple sources of survey data and
population data; (2) Improve estimates of the parameters of fertility
and marital and non-marital union regression equations, and of
simulated life-course fertility and union duration measures; and (3)
Expand and disseminate the statistical capabilities to the
demographic community. It will be shown that combining population and
survey data in the estimation allows for more modeling detail than
when using population data alone, and more precise estimates than when
using survey data alone. Further statistical development will allow
for survey data to be combined from more than one data set, thereby
obtaining some of the same benefits as from combining survey and
population data. Methods for incorporating degrees of inaccuracy in
the population data, and imperfect matches between the population
collection and the survey's sampling frame and collection methods,
will also be developed and applied. Comparative applications of the
methods between the U.S. and the U.K. will be made to explore their
advantages and challenges over a greater range of population data
collection types than available in the U.S. alone. Applications across
multiple developed countries will demonstrate that methods for
combining survey and population data can be used to overcome the
otherwise severe restrictions placed on cross-national comparisons.
- PI: Elena Erosheva
- Funding Agent: The Center for Statistics and the Social Sciences
- Amount: $37,424
- Date: 2003-2004 Academic Year
- Title: "Statistics and Social Work
Collaborative Research Initiative."
- Abstract:
This project initiates the development of collaborative research between the
Center for Statistics and the Social Sciences and the School of Social Work.
Potential involvement includes collaborative work on two projects led by
senior Social Work faculty members, Roger Roffman and David Takeuchi.
Roger Roffman's project is an intervention study that focuses on adults who
are both batterers and substance abusers. The primary interest is in
developing outcome measures that would allow the assessment of the efficacy
of the experimental intervention. The study plan is unique in its focus on a
group of people who are both substance abusers and are abusive to their
intimate partners. Currently existing measurement and intervention
procedures have been developed with the focus on either substance abuse or
domestic violence. Studying the co-occurrence of these two behaviors will
require new methodological developments in this area.
David Takeuchi's research examines how mental illness and medical care are
distributed across race, ethnicity, and socio-economic status. Takeuchi's
current research project is the National Latino and Asian American Study
(NLAAS) that is intended to investigate the social, cultural, and contextual
factors that are associated with mental illness and help-seeking in these
large ethnic categories. One facet of NLAAS includes extensive questions of
quality of life to better understand how Asian Americans and Latinos cope
with stressful conditions of life. Sophisticated statistical analyses have
not been typically performed on these quality of life indicators. The goals
of this research include determining the psychometric properties of these
scale items and assessing the social and cultural factors associated with
these different levels of functional status.
- PI: Adrian E. Raftery
- Funding Agent: Office of Naval Research
- Amount: $5,156,827
- Date: May 1, 2001
- Title: "Integration & visualization of
multi-source information for mesoscale meteorology: statistics & cognitive
approaches to visualizing uncertainty."
- Abstract:
Current methods of meteorological forecasting produce predictions with
unknown levels of uncertainty, particularly in regions with few
observational assets. Forecast errors and uncertainties also arise from
shortcomings in model physics. With the ability to estimate the uncertainty
in predictions, forecasters would have a powerful tool to make decisions and
to judge the likelihood of mission success.
The goals of our proposed project are to develop methods for evaluating the
uncertainty of mesoscale meteorological model predictions, and to create
methods for the integration and visualization of multisource information
derived from model output, observations and expert knowledge. We will do
this by extending the recently developed Bayesian melding approach. We will
also develop statistical methods for combining results from model ensembles,
taking account of model uncertainty. This will build on the general idea of
Bayesian model averaging. We will also develop tools and methods for
visualizing predictions of quantities of interest and the uncertainty about
them by (i) choosing appropriate quantities of interest for display based on
cognitive factors, and (ii) developing appropriate plots, maps,
three-dimensional displays, and video displays for decision support.
- PI: Mark S. Handcock
- Funding Agent: National Institutes of Health
- Amount: $259,352
- Date: July 1, 2001
- Title: "Modeling HIV and STDs in Drug User and Sexual Networks"
- Abstract:
Infectious diseases are distinguished from other diseases by being
transmissible. Our understanding of disease transmission, and the
preventive strategies that arise from such understanding, are therefore
rooted in an implicit or explicit theory of population transmission
dynamics. For infectious diseases like STDs and BBIs, that are only
transmitted through the exchange of bodily fluids, the structure of the
transmission network plays a particularly critical role. The epidemiology
of these diseases - how quickly they spread and who gets infected - is
driven by the network of person-to-person contact. Mathematical models of
this process have provided a number of insights that have led to changes in
STD control strategies. With the advent of HIV, however, new modeling
challenges have emerged. In this research we develop new models for drug
user and sexual networks as a means to understand the factors that influence
the spread of HIV and other STDs.
- PI: Mark S. Handcock
- Funding Agent: National Science Foundation
- Amount: $23,526
- Date: January 1, 2001
- Title: "Collaborative Research: Hybrid
Population-Average and Individual-Specific Models for Clustered Longitudinal
Data"
- Abstract:
We propose the merger of two good ideas in the context of social and
behavioral statistical models for longitudinal data. The first, model-based
clustering, allows the researcher to locate subgroups in a population,
should they exist, for a larger class of datasets. The second is to model
individual-specific variation using data-adaptive proto-splines. We will
extend the set of available heterogeneity models for longitudinal data to
include several mean and covariance structures via latent classes, and in
turn model the covariance structures using the adaptive proto-spline class
of Handcock and Scott (1999).
We will also develop the theory and practice of inference for these models,
develop objective measures for model comparison and model goodness-of-fit,
and make appropriate software publicly available.
As a case study of the use of these methods, we will analyze long-term
trends in wage inequality and the dynamics of change of these trends in the
period from the mid-1960s to the mid-1990s. The analysis will be based on
young workers from the National Longitudinal Survey (NLS). This analysis
will be important to help characterize changes in the experiences of workers
in the post-industrial economy.
- PI: Mark S. Handcock
- Funding Agent: National Science Foundation
- Amount: $27,957
- Date: August 1, 2000
- Title: "Collaborative Research: Nonparametric
Models for Incomplete Clustered Data with Applications to the Social Sciences"
- Abstract:
Clustered data are very common in social sciences research and other fields.
For example, in a study involving school children, school districts form
clusters and schools form sub-clusters within each cluster. In this
context, researchers want to explain a certain variable of interest (the
response variable) in terms of certain categorical variables (factors) while
adjusting for the presence of other incidental variables (covariates) which
might influence the response. This project aims at developing statistical
methods for analyzing such data. Though the classical statistical methods
accommodate the lack of independence which is inherent to data arising from
cluster sampling, they are very often unsuitable for data arising from
social science research. This is because they require a set of restrictive
assumptions (such as normality and homogeneity of the residuals, linearity,
scale dependence) which are rarely satisfied in the social sciences. In
addition, data in social sciences research are often incomplete (censored or
missing) in which case inference based on the classical statistical models
cannot be implemented. Alternative approaches developed to deal with these
issues also rely on assumptions which may or may not be satisfied for any
given application. The research for this project will focus on the
development of statistical models and methods that are free of restrictive
assumptions. A central component of the project is the application of these
methods to questions regarding routine activities and deviant behavior, and
to the question of whether there has been a secular rise in job instability
among young adults over the past three decades using two cohorts from the
National Longitudinal Survey (NLS). Programs for formal hypothesis testing,
graphical summaries of effects, and exploratory data analysis plots will be
made available on the web for use by the social sciences community.
- PI: Elaina Rose
- Funding Agent: National Institutes of Health
- Amount: $71,054
- Date: September, 2002
- Title: "Marriage and Assortative Mating"
- Abstract:
The role of marriage has undergone profound change in recent decades.
Changes in the patterns of "assortative mating," i.e., in the types of
partners that individuals choose when they do form unions, likely accompany
the changes in marriage patterns. The objective of this proposal is to
expand the literature on marriage and assortative mating by refining the
estimates of marriage and assortative mating patterns, and developing an
econometric model of the joint union status and partner choice outcomes.
The proposal includes a pilot study of the patterns in marriage and
assortative mating with respect to education which suggests the following
specific research questions: (1) How does the relationship between
education and the likelihood of marriage differ when cohabitors are treated
as married couples? (2) Are assortative mating patterns different for
cohabiting and married couples? (3) What are the patterns in assortative
mating with respect to characteristics such as parents' education and
"unobserved ability"? (4) Can the cohort differences in the marriage
patterns be explained by differences in observables such as education,
family policy, or marriage market conditions? (5) Can changes in
assortative mating be explained by changes in the pattern of selection into
marriage? (6) Do women face a tradeoff between partner quality and union
"cohesion"?
- PI: Adrian E. Raftery
- Funding Agent: National Institutes of Health
- Amount: $1,090,322
- Date: September, 2002
- Title: "Model-Based Clustering Methods for Medical Images"
- Abstract:
Many problems in the health and medical sciences have at their core the task
of finding cohesive groups of observations in data. Examples include a group
of voxels in an MRI image that correspond to a tumor, genes whose mRNA
expression levels track one another, and tissues whose gene expression
patterns are similar. The statistical method for solving this problem is
cluster analysis. Most cluster analysis methods used in practice have been
ad hoc, but recently the development of more formal model-based clustering
methods has provided a principled framework for answering central questions
such as: How many clusters are there? Which clustering method should be
used? How should one deal with outliers?
Our main goal is to develop new methods for problems in model-based
clustering that arise in medical image segmentation and gene expression
data. The three major thrusts will be the development of: (A) model-based
clustering methods for large numbers of variables; (B) automated medical
image segmentation methods appropriate for dynamic MRI breast images; and
(C) model-based clustering methods for microarray gene expression data aimed
at finding groups of genes that function together, and groups of tissues or
tissue types that have similar gene expression patterns.
- PI: Kevin Quinn
- Funding Agent: National Science Foundation
- Amount: $51,133
- Date: September, 2002
- Title: "Collaborative Research: The Dimensions of Supreme Court Decision-Making, 1946-2000"
- Abstract:
We propose new statistical models that can be used to gain a better
understanding of the dynamics of decision making on the U.S. Supreme Court.
Substantively, we hope to obtain better answers to the following questions:
To what extent do the policy preferences of justices outweigh purely formal,
legal concerns when deciding cases? Have the decisions of lower courts
become more liberal over time? In what manner have the policy outputs of the
Court changed over time? The small number of
justices on the Court creates a number of potential inferential problems. To
alleviate these problems we adopt a Bayesian inferential approach. An
additional benefit of such an approach is that it allows us to include
previous qualitative work on the Court into our statistical models in a
fairly direct fashion. The models we propose allow us to simultaneously
estimate the unobserved ideal policy positions of the justices, the
unobserved policy content of the lower courts' rulings, and the effects of
measured covariates on the decision calculus of individual justices.
- PI: Peter Hoff
- Funding Agent: Office of Naval Research
- Amount: $275,000
- Date: September, 2002
- Title: "Statistical Modeling of Dependent Network Data"
- Abstract:
Network data summarize relational information among interacting units, and
are common in many areas of research. Applications include international
conflict, international trade, telephone calling patterns, chain-of-command
networks in businesses and other organizations, the behavior of epidemics
and the interconnectedness of the world wide web. Such data differ from
standard data in that they consist of observations on pairs of experimental
units, and in that the observations among pairs are typically not
independent, but dependent in complicated ways. Past efforts at modeling
dependencies in networks have focused on exponentially parameterized
random-graph models (often
referred to as the p* class of models), which have been difficult to
estimate and often give a poor fit to actual network data. Additionally,
such models have focused on the case of binary responses, and have
difficulty modeling common types of network data such as continuous, count,
time-series, and multivariate data. In contrast, the proposed project will
develop a flexible modeling strategy for dependent network data using a
novel random effects approach, which can easily be incorporated within
well-known statistical methods such as linear regression, generalized linear
models, semiparametric regression, and others. Preliminary results suggest
such an approach has several advantages over current practice. The proposed
approach allows for prediction and hypothesis testing; lends itself to a
model-based method of network visualization; is highly extendible and
interpretable in terms of well known statistical procedures; and has a
feasible means of exact parameter estimation.
|