Program of the 8th German Stata Users Group Meeting

17 May 2010


The eighth German Stata Users Group meeting will be held at Berlin
Graduate School of Social Sciences on Friday, June 25, 2010.  We would
like to invite everybody from everywhere who is interested in using
Stata to attend this meeting. The conference language will be English
because of the international nature of the meeting and the
participation of non-German guest speakers.
The meeting will include presentations about causal models, general
statistics, and data management, both by researchers and by StataCorp
staff. The meeting will also include a "wishes and grumbles" session,
during which you may air your thoughts to Stata developers. Finally,
there is (at additional cost) the option of an informal meal at a
Berlin restaurant on Friday evening.
Details about accommodations and fees are given below the program.
Venue
-----
Berlin Graduate School of Social Sciences 
Luisenstraße 56 
10117 Berlin 
http://www.bgss.hu-berlin.de/
Conference Website
------------------
http://www.stata.com/meeting/germany10/
See http://www.stata.com/meeting/proceedings.html for the proceedings of
other Users Group meetings.
Program 
-------
8:45-9:15 Registration
9:15-10:15 Biometrical modeling of twin and family data in Stata    
Sophia Rabe-Hesketh  
University of California at Berkeley  
sophiarh@berkeley.edu
Data on twins or other types of family structures (e.g., nuclear
families, siblings, or cousins) can be used to estimate the
proportion of variability in observed traits or "phenotypes" that
is due to genes.  The models are essentially multivariate
regression models with residual covariance structures dictated by
Mendelian genetics.  Usually, specialised software for structural
equation modeling is used. However, the required covariance
structures can also be produced using mixed models by specifying
an appropriate design matrix for the random part of the
model. Stata's xtmixed command can then be used to estimate the
models.  For binary phenotypes, such as diabetes, the appropriate
probit models can be estimated using gllamm.
10:15-11:15 An introduction to matching methods for causal
inference and their implementation in Stata    
Barbara Sianesi  
Institute for Fiscal Studies
barbara_s@ifs.org.uk
Matching, especially in its propensity score flavours, has become
an extremely popular evaluation method. Matching is in fact the
best available method for selecting a matched (or re-weighted)
comparison group which `looks like' the (treatment) group of
interest.
In this talk I will introduce matching methods within
the general problem of causal inference, highlight their
strengths and weaknesses and offer a brief overview of different
matching estimators. Using psmatch2, I will then go through a
practical example in Stata based on real data to implement some
of these estimators as well as to highlight a number of
implementation issues.
11:15-11:30 Coffee
11:30-12:00 Heterogeneous Treatment Effect Analysis 
Ben Jann  
ETH Zürich  
jannb@ethz.ch
Methods for causal inference and the estimation of treatment
effects have received much attention in recent years. Most of the
methodological and applied work focuses on the identification of
so-called average treatment effects (ATE), possibly restricted to
the treated (ATT) or the untreated (ATU). However, treatment
effects may vary (hence the averaging) and it can be interesting
to analyze the patterns of effect heterogeneity. In this talk I
will present a new command called hte that is used for
heterogeneous treatment effect analysis in Stata.  hte first
constructs balanced propensity score strata and, within each
stratum, estimates the average treatment effect.  hte then tests
for linear trend in effects across the strata. The
stratum-specific treatment effects and the estimated linear trend
are displayed in a twoway graph. The hte command resulted from joint work
with Jennie E. Brand (UCLA) and Yu Xie (University of Michigan).
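As a rough illustration of the three steps named in this abstract
(stratify on the propensity score, estimate an effect within each
stratum, look for a linear trend across strata), here is a minimal
sketch in Python. It is not the hte implementation: the propensity
scores are assumed to be given, the strata are fixed by hand-picked
cutpoints rather than by a balancing algorithm, the trend is a plain
unweighted least-squares slope, and all names and data are made up.

```python
from statistics import mean

def stratum_effects(records, cutpoints):
    """Difference in mean outcomes (treated minus control) per stratum.

    records   : list of (pscore, treated, outcome) tuples (toy input)
    cutpoints : ascending upper bounds of the strata (an assumption;
                hte builds balanced strata, which is not done here)
    Raises statistics.StatisticsError if a stratum lacks a group.
    """
    strata = [[] for _ in cutpoints]
    for pscore, treated, outcome in records:
        for k, upper in enumerate(cutpoints):
            if pscore <= upper:
                strata[k].append((treated, outcome))
                break
    effects = []
    for rows in strata:
        y_treated = [y for d, y in rows if d == 1]
        y_control = [y for d, y in rows if d == 0]
        effects.append(mean(y_treated) - mean(y_control))
    return effects

def linear_trend(effects):
    """Unweighted OLS slope of the stratum effects on the stratum rank."""
    ranks = list(range(1, len(effects) + 1))
    r_bar, e_bar = mean(ranks), mean(effects)
    num = sum((r - r_bar) * (e - e_bar) for r, e in zip(ranks, effects))
    den = sum((r - r_bar) ** 2 for r in ranks)
    return num / den

# Made-up data: the effect grows with the propensity score.
records = [
    (0.10, 1, 3.0), (0.20, 0, 2.0), (0.15, 1, 3.0), (0.25, 0, 2.0),
    (0.40, 1, 5.0), (0.50, 0, 3.0), (0.45, 1, 5.0), (0.55, 0, 3.0),
    (0.80, 1, 8.0), (0.90, 0, 5.0), (0.85, 1, 8.0), (0.95, 0, 5.0),
]
effects = stratum_effects(records, [0.33, 0.66, 1.0])
slope = linear_trend(effects)
print(effects, slope)
```

With these toy numbers the stratum effects are 1, 2 and 3, so the
fitted trend is one unit per stratum.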
12:00-12:30 Estimation of Linear Fixed-Effects models with
individual-specific slopes in Stata
Volker Ludwig  
Mannheim Centre for European Social Research (MZES)  
vludwig@mail.uni-mannheim.de
Fixed-effects regression is considered a powerful method to
estimate causal effects with survey data. However, in the linear
model the conventional technique of time-demeaning does not yield
consistent estimates of the parameters when unobserved
heterogeneity is not time-constant. Wooldridge (2002: 317ff.)
derived a general model for the situation where unobserved and
observed characteristics of individuals interact to produce the
outcome. The Fixed-Effects model with Individual constants and
Slopes (FEIS) is a remedy for biased coefficients due to, for
example, maturation or learning where unobserved traits affect
individual growth curves differently for treated and controls.
The Stata ado xtfeis implements the FEIS estimator in Mata,
allowing for individual constants and (potentially many)
slopes. Without specifying slope variables, the model collapses to
the conventional model estimated by xtreg, fe, which accounts
for individual constants only. The ado implements standard errors
that are robust to serial correlation or heteroskedasticity of
unknown form. Estimates of the slope parameters are available
optionally. The command requires panel data with at least J+1
observations per unit, where J is the number of individual-specific
slope variables (including usually, but not necessarily, also the
individual-specific constant).  I will present results for the
effect of marriage on male wages based on real data (GSOEP and NLSY)
to demonstrate the practical relevance of the method. Simulation
results will be used to assess robustness of the estimator to
autocorrelation, measurement error and misspecification of
functional form.
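The idea behind the estimator can be sketched in a few lines of
Python. This is only an illustration, not the xtfeis code (which is
written in Mata): it assumes a single regressor x, an
individual-specific constant and an individual-specific linear time
trend (so J = 2), and it reports no standard errors. Each unit's x
and y are first detrended on the constant and time, and the detrended
series are then pooled in one least-squares fit. Function names and
data are hypothetical.

```python
def detrend(values, times):
    """Residuals from an OLS fit of values on a constant and times."""
    n = len(values)
    t_bar = sum(times) / n
    v_bar = sum(values) / n
    s_tt = sum((t - t_bar) ** 2 for t in times)
    slope = sum((t - t_bar) * (v - v_bar)
                for t, v in zip(times, values)) / s_tt
    return [v - v_bar - slope * (t - t_bar) for t, v in zip(times, values)]

def feis_slope(panel):
    """Pooled coefficient on x after unit-wise detrending.

    panel : dict mapping a unit id to a list of (t, x, y) tuples
    Units with fewer than J + 1 = 3 observations are skipped.
    """
    num = den = 0.0
    for rows in panel.values():
        if len(rows) < 3:
            continue
        times = [t for t, _, _ in rows]
        x_res = detrend([x for _, x, _ in rows], times)
        y_res = detrend([y for _, _, y in rows], times)
        num += sum(a * b for a, b in zip(x_res, y_res))
        den += sum(a * a for a in x_res)
    return num / den

def make_unit(const, trend, xs, beta=2.0):
    """Noise-free toy unit: y = const + trend * t + beta * x."""
    return [(t, x, const + trend * t + beta * x)
            for t, x in enumerate(xs, start=1)]

# Two made-up units with different constants and different slopes in t.
panel = {
    "A": make_unit(1.0, 0.5, [0, 1, 0, 1]),
    "B": make_unit(-2.0, 1.5, [0, 0, 1, 1]),
}
beta_hat = feis_slope(panel)
print(beta_hat)
```

Because the toy data contain no noise, the pooled estimate recovers
the coefficient of 2 used to generate them, even though the two units
follow different growth curves.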
12:30-13:30 Lunch
13:30-14:00 Implementation of a multinomial logit model
with fixed effects    
Klaus Pforr  
Mannheim Centre for European Social Research (MZES)  
klaus.pforr@mzes.uni-mannheim.de
Fixed-effects models have become increasingly popular in the field
of sociology. The possibility to control for unobserved
heterogeneity makes these models a prime tool for causal
analysis.  As of today, fixed-effects models have been derived
and implemented in many statistical software packages for
continuous, dichotomous, and count-data dependent variables, but
there are still many important and popular statistical models
for which only population-average estimators are available, such
as models for multinomial categorical dependent variables. Such a
model was derived in a seminal paper by Chamberlain (1980).
Possible applications would be analyses of effects on
employment status with special consideration of part-time or
irregular employment, and analyses of effects on voting
behavior that implicitly control for long-term party
identification rather than having to measure it directly. This
model has not yet been implemented in any statistical software
package.  In this presentation I show an ado that closes
this gap. The implementation draws on the native Stata
multinomial logit and conditional logit model
implementations. The actual ml evaluator uses Mata functions
to implement the conditional likelihood function. Finally, to
analyze the numerical stability of the implementation, some basic
simulation results are shown.
14:00-14:45 Generalized method of moments estimators in Stata 
David Drukker  
StataCorp  
ddrukker@stata.com
Stata 11 has a new command, gmm, for estimating parameters by the
generalized method of moments (GMM).  gmm can estimate the
parameters of linear and nonlinear models for cross-sectional,
panel, and time-series data. In this presentation, I provide an
introduction to GMM and to the gmm command.
14:45-15:15 Analyzing proportions 
Maarten Buis  
University of Tübingen  
maarten.buis@uni-tuebingen.de
In this talk I will discuss some of the techniques available in
Stata for analyzing dependent variables that are proportions. I
will discuss four programs: -betafit-, -glm-, -dirifit-, and
-fmlogit-. The first two deal with the situation where we want to
explain only one proportion, while the latter two deal with the
situation where we have for each observation multiple proportions
which must add up to one. I will focus on how to interpret the
results of these models, and on a discussion of the relative
strengths and weaknesses of these models.
15:15-15:30 Coffee
15:30-16:00 User-written Stata program: Agrm 
Alejandro Ecker  
University of Mannheim  
aecker@mail.uni-mannheim.de
In the context of his research on perceptual agreement, van der
Eijk (2001) indicates that empirical measures which resort to the
standard deviation of the response distribution capture not only
consensus but skewness as well and are thus inappropriate
measures of agreement. His alternative measure of agreement "A"
circumvents this problem and yields unbiased figures for all
kinds of ordered rating scales. It first decomposes the frequency
distribution into constituent layers, i.e. row vectors, for which
consensus can be unambiguously defined and then computes the
weighted average degree of agreement.  Given the lack of a
corresponding ado-file, the user-written agrm command allows users to
calculate van der Eijk's index of agreement "A" directly in
Stata. Besides a broad range of basic programming features such
as low-level parsing and specifying additional program options,
it also entails more advanced techniques such as the handling of
empty categories and that of numerical missing values. Moreover,
it highlights the potential of nested loops and local macros in
the context of multiple permutations. Finally, the agrm command
is especially suited for showing how Stata’s matrix language,
Mata, provides a powerful environment for handling vectors and
matrices.
Reference: van der Eijk, Cees. 2001. Measuring Agreement in Ordered
Rating Scales. Quality & Quantity 35 (3): 325-341.
16:00-16:30 Yet another program to create publication
quality tables
Tamas Bartus
Institute of Sociology and Social Policy, Corvinus University, Budapest
tamas.bartus@uni-corvinus.hu
Stata users have developed several programs in order to create
publication quality documents containing regression
results (outreg, outreg2, outtex, estout), tables of
statistics (tabout) and contents of matrices
(outtable). So far, less effort has been made to enable the easy
publication of other kinds of tables, like those displaying the
definitions of variables and summary statistics. Although the
sophisticated estout package can create tables other than
regression results, the underlying mechanism of posting results
as if they were estimation results has limitations, and removing
these limitations should involve additional programming. The
user-written command publish (working title) is intended for
users with limited knowledge of programming. It creates
publication quality HTML, MS Word or LaTeX documents which may
consist of tables displaying definitions of variables, codebooks,
summary statistics, one-way and two-way tables of frequencies,
tables of various statistics and estimation results. Large tables
where results are separately shown for various subsamples or for
several crosstabulations with a common dependent variable, or
tables combining different sorts of elementary tables can easily
be created. Users can also publish matrices, part of the data in
memory, and are allowed to create empty tables into which results
from other tables can be pasted. Controlling the layout of the
table and the column (and supercolumn) titles is
also easy through a small number of common options.
16:30-17:00 RDS: A Stata Program for Respondent Driven Sampling
Matthias Schonlau and Elisabeth Birkner
DIW and RAND Corporation
mschonlau@diw.de, ebirkner@diw.de
Respondent driven sampling (RDS) is a sampling technique
typically employed for hard-to-reach populations (e.g. homeless,
people with AIDS, immigrants). Briefly, initial seed respondents
recruit additional respondents from their network of friends. The
recruiting process repeats iteratively, thereby forming long
referral chains. It is crucial to obtain estimates of
respondents’ network size (e.g. number of friends with the
characteristic of interest).  RDS shares some similarities with
snowball sampling, but the theoretical foundation for inference
using RDS samples is much stronger. We will give a brief overview
of this technique and introduce a new user-written Stata
command for RDS.
17:00-17:15 Coffee
17:15-17:45 Report to the Users    
Bill Gould
StataCorp
wgould@stata.com
Bill Gould, President of StataCorp and head of development, talks
about Stata.
17:45-18:30 Wishes and Grumbles
Registration and accommodations 
-------------------------------
Please travel at your own expense. The conference fee will be 35 Euro
(Students 15 Euro) to cover costs for coffee, teas, and
luncheons. There will also be an optional informal meal at a
restaurant in Berlin on Friday evening at additional cost.
You can enroll by contacting Anke Mrosek (anke.mrosek@dpc.de)
by email or by writing, phoning, or faxing:
Anke Mrosek 
Dittrich & Partner Consulting GmbH 
Kieler Str. 17 
42697 Solingen 
Germany
Tel: +49 (0) 212 260 66 24 
Fax: +49 (0) 212 260 66 66
Scientific Organizers 
---------------------
The academic program of the meeting is being organized by
Johannes Giesecke (giesecke@wzb.eu), Martin Groß
(martin.gross@sowi.hu-berlin.de), and Ulrich
Kohler (kohler@wzb.eu).
Logistics organizers 
--------------------
The logistics are being organized by Dittrich and Partner
(http://www.dpc.de), the distributor of Stata in several
countries including Germany, the Netherlands, Austria, the Czech
Republic, and Hungary.