The eighth German Stata Users Group meeting will be held at the Berlin Graduate School of Social Sciences on Friday, June 25, 2010. We would like to invite everyone who is interested in using Stata, wherever they are from, to attend this meeting. The conference language will be English because of the international nature of the meeting and the participation of non-German guest speakers.
The meeting will include presentations about causal models, general statistics, and data management, both by researchers and by StataCorp staff. The meeting will also include a "wishes and grumbles" session, during which you may air your thoughts to Stata developers. Finally, there is (at additional cost) the option of an informal meal at a Berlin restaurant on Friday evening.
Details about accommodations and fees are given below the program.
Venue
-----
Berlin Graduate School of Social Sciences
Luisenstraße 56
10117 Berlin
http://www.bgss.hu-berlin.de/
Conference Website
------------------
http://www.stata.com/meeting/germany10/
See http://www.stata.com/meeting/proceedings.html for the proceedings of other Users Group meetings.
Program
-------
8:45-9:15 Registration
9:15-10:15 Biometrical modeling of twin and family data in Stata Sophia Rabe-Hesketh University of California at Berkeley sophiarh@berkeley.edu
Data on twins or other types of family structures (e.g., nuclear families, siblings, or cousins) can be used to estimate the proportion of variability in observed traits or "phenotypes" that is due to genes. The models are essentially multivariate regression models with residual covariance structures dictated by Mendelian genetics. Usually, specialised software for structural equation modeling is used. However, the required covariance structures can also be produced using mixed models by specifying an appropriate design matrix for the random part of the model. Stata's xtmixed command can then be used to estimate the models. For binary phenotypes, such as diabetes, the appropriate probit models can be estimated using gllamm.
10:15-11:15 An introduction to matching methods for causal inference and their implementation in Stata Barbara Sianesi Institute for Fiscal Studies barbara_s@ifs.org.uk
Matching, especially in its propensity score flavours, has become an extremely popular evaluation method. Matching is in fact the best available method for selecting a matched (or re-weighted) comparison group which `looks like' the (treatment) group of interest. In this talk I will introduce matching methods within the general problem of causal inference, highlight their strengths and weaknesses, and offer a brief overview of different matching estimators. Using psmatch2, I will then go through a practical example in Stata based on real data to implement some of these estimators and to highlight a number of implementation issues.
11:15-11:30 Coffee
11:30-12:00 Heterogeneous Treatment Effect Analysis Ben Jann ETH Zürich jannb@ethz.ch
Methods for causal inference and the estimation of treatment effects have received much attention in recent years. Most of the methodological and applied work focuses on the identification of so-called average treatment effects (ATE), possibly restricted to the treated (ATT) or the untreated (ATU). However, treatment effects may vary (hence the averaging) and it can be interesting to analyze the patterns of effect heterogeneity. In this talk I will present a new command called hte that is used for heterogeneous treatment effect analysis in Stata. hte first constructs balanced propensity score strata and, within each stratum, estimates the average treatment effect. hte then tests for a linear trend in effects across the strata. The stratum-specific treatment effects and the estimated linear trend are displayed in a twoway graph. The hte command resulted from joint work with Jennie E. Brand (UCLA) and Yu Xie (University of Michigan).
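The stratify-then-trend idea described in this abstract can be illustrated with a short sketch. It is written in Python rather than Stata and is not the hte implementation. It assumes the propensity scores are already estimated, uses equal-sized strata instead of balanced ones, and computes only an unweighted trend slope without the accompanying test. The function names and the data are hypothetical.

```python
from statistics import mean


def stratum_effects(records, n_strata):
    """Treated-minus-control difference in mean outcomes within each stratum.

    records: list of (propensity_score, treated, outcome) tuples.
    Assumption: strata are equal-sized slices of the score-sorted sample.
    """
    ordered = sorted(records, key=lambda r: r[0])
    n = len(ordered)
    effects = []
    for s in range(n_strata):
        chunk = ordered[s * n // n_strata:(s + 1) * n // n_strata]
        treated = [y for _, d, y in chunk if d]
        control = [y for _, d, y in chunk if not d]
        if not treated or not control:
            raise ValueError("each stratum needs treated and control cases")
        effects.append(mean(treated) - mean(control))
    return effects


def linear_trend(effects):
    """Unweighted OLS slope of the stratum effects on the stratum rank 1..S."""
    if len(effects) < 2:
        raise ValueError("need at least two strata to fit a trend")
    ranks = list(range(1, len(effects) + 1))
    r_bar, e_bar = mean(ranks), mean(effects)
    s_rr = sum((r - r_bar) ** 2 for r in ranks)
    s_re = sum((r - r_bar) * (e - e_bar) for r, e in zip(ranks, effects))
    return s_re / s_rr


# Hypothetical data: three score ranges whose true effects are 1, 2 and 3.
records = []
for s, effect in enumerate([1.0, 2.0, 3.0]):
    base = 0.2 + 0.3 * s
    records += [
        (base, 0, 10.0), (base + 0.05, 0, 12.0),
        (base + 0.10, 1, 11.0 + effect), (base + 0.15, 1, 11.0 + effect),
    ]

effects = stratum_effects(records, 3)
trend = linear_trend(effects)
```

A positive slope in this toy setup means the effect grows with the propensity to be treated. A fuller treatment would weight the strata by the precision of their estimates.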
12:00-12:30 Estimation of Linear Fixed-Effects models with individual-specific slopes in Stata Volker Ludwig Mannheim Centre for European Social Research (MZES) vludwig@mail.uni-mannheim.de
Fixed-effects regression is considered a powerful method to estimate causal effects with survey data. However, in the linear model the conventional technique of time-demeaning does not yield consistent estimates of the parameters when unobserved heterogeneity is not time-constant. Wooldridge (2002: 317ff.) derived a general model for the situation where unobserved and observed characteristics of individuals interact to produce the outcome. The Fixed-Effects model with Individual constants and Slopes (FEIS) is a remedy for biased coefficients due to, for example, maturation or learning where unobserved traits affect individual growth curves differently for treated and controls.
The Stata ado xtfeis implements the FEIS estimator in Mata, allowing for individual constants and (potentially many) slopes. Without specifying slope variables, the model collapses to the conventional model estimated by xtreg, fe that accounts for individual constants only. The ado implements standard errors that are robust to serial correlation or heteroskedasticity of unknown form. Estimates of the slope parameters are available optionally. The command requires panel data with at least J+1 observations per unit, where J is the number of individual-specific slope variables (including usually, but not necessarily, also the individual-specific constant). I will present results for the effect of marriage on male wages based on real data (GSOEP and NLSY) to demonstrate the practical relevance of the method. Simulation results will be used to assess robustness of the estimator to autocorrelation, measurement error and misspecification of functional form.
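The core of the FEIS idea can be sketched for the simplest case: one regressor, with an individual constant and one individual slope on time. The sketch is in Python rather than Stata and is not the xtfeis code. It removes each unit's own linear trend from the outcome and the regressor, then pools the residuals. The function names and the noise-free data are hypothetical.

```python
def detrend(values, times):
    """Residuals from a per-unit OLS fit of values on a constant and time."""
    n = len(values)
    t_bar = sum(times) / n
    v_bar = sum(values) / n
    s_tt = sum((t - t_bar) ** 2 for t in times)
    s_tv = sum((t - t_bar) * (v - v_bar) for t, v in zip(times, values))
    slope = s_tv / s_tt
    return [v - v_bar - slope * (t - t_bar) for t, v in zip(times, values)]


def feis_coefficient(panel):
    """Pooled OLS slope of detrended y on detrended x.

    panel: dict mapping a unit id to a list of (time, x, y) rows.
    With a constant and one slope variable (J = 2), each unit needs
    at least J + 1 = 3 observations.
    """
    num = den = 0.0
    for rows in panel.values():
        if len(rows) < 3:
            raise ValueError("each unit needs at least three observations")
        times = [t for t, _, _ in rows]
        x_res = detrend([x for _, x, _ in rows], times)
        y_res = detrend([y for _, _, y in rows], times)
        num += sum(a * b for a, b in zip(x_res, y_res))
        den += sum(a * a for a in x_res)
    return num / den


# Hypothetical noise-free panel: y = a_i + b_i * t + BETA * x.
BETA = 0.3
x_paths = {"a": [0, 1, 0, 1, 1], "b": [1, 1, 0, 0, 1]}
params = {"a": (1.0, 0.5), "b": (-2.0, 2.0)}  # (intercept, time slope) per unit
panel = {
    unit: [
        (t, x, params[unit][0] + params[unit][1] * t + BETA * x)
        for t, x in enumerate(x_paths[unit])
    ]
    for unit in x_paths
}

beta_hat = feis_coefficient(panel)
```

Because the individual intercepts and time slopes are removed exactly, beta_hat recovers BETA up to rounding in this example. Robust standard errors and multiple slope variables are outside the scope of the sketch.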
12:30-13:30 Lunch
13:30-14:00 Implementation of a multinomial logit model with fixed effects Klaus Pforr Mannheim Centre for European Social Research (MZES) klaus.pforr@mzes.uni-mannheim.de
Fixed-effects models have become increasingly popular in the field of sociology. The possibility to control for unobserved heterogeneity makes these models a prime tool for causal analysis. As of today, fixed-effects models have been derived and implemented in many statistical software packages for continuous, dichotomous, and count-data dependent variables, but there are still many important and popular statistical models for which only population-average estimators are available, such as models for multinomial categorical dependent variables. Such a model was derived in a seminal paper by Chamberlain (1980). Possible applications would be analyses of effects on employment status with special consideration of part-time or irregular employment, and analyses of effects on voting behavior that implicitly control for long-time party identification rather than having to measure it directly. This model has not yet been implemented in any statistical software package. In this presentation I show an ado that closes this gap. The implementation draws on the native Stata multinomial logit and conditional logit model implementations. The actual ml evaluator utilizes Mata functions to implement the conditional likelihood function. Finally, to analyze the numerical stability of the implementation, some basic simulation results are shown.
14:00-14:45 Generalized method of moments estimators in Stata David Drukker StataCorp ddrukker@stata.com
Stata 11 has a new command, gmm, for estimating parameters by the generalized method of moments (GMM). gmm can estimate the parameters of linear and nonlinear models for cross-sectional, panel, and time-series data. In this presentation, I provide an introduction to GMM and to the gmm command.
14:45-15:15 Analyzing proportions Maarten Buis University of Tübingen maarten.buis@uni-tuebingen.de
In this talk I will discuss some of the techniques available in Stata for analyzing dependent variables that are proportions. I will discuss four programs: -betafit-, -glm-, -dirifit-, and -fmlogit-. The first two deal with the situation where we want to explain only one proportion, while the latter two deal with the situation where we have for each observation multiple proportions which must add up to one. I will focus on how to interpret the results of these models, and on a discussion of the relative strengths and weaknesses of these models.
15:15-15:30 Coffee
15:30-16:00 User-written Stata program: Agrm Alejandro Ecker University of Mannheim aecker@mail.uni-mannheim.de
In the context of his research on perceptual agreement, van der Eijk (2001) indicates that empirical measures which resort to the standard deviation of the response distribution capture not only consensus but skewness as well and are thus inappropriate measures of agreement. His alternative measure of agreement "A" circumvents this problem and yields unbiased figures for all kinds of ordered rating scales. It first decomposes the frequency distribution into constituent layers, i.e. row vectors, for which consensus can be unambiguously defined and then computes the weighted average degree of agreement. Given the lack of a corresponding ado-file, the user-written agrm command allows users to calculate van der Eijk's index of agreement "A" directly in Stata. Besides a broad range of basic programming features such as low-level parsing and specifying additional program options, it also entails more advanced techniques such as the handling of empty categories and of numerical missing values. Moreover, it highlights the potential of nested loops and local macros in the context of multiple permutations. Finally, the agrm command is especially suited for showing how Stata's matrix language, Mata, provides a powerful environment for handling vectors and matrices.
Reference: van der Eijk, Cees. 2001. Measuring Agreement in Ordered Rating Scales. Quality & Quantity 35 (3): 325-341.
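The layer decomposition mentioned in the abstract can be made concrete with a short sketch. It is in Python rather than Stata and is not the agrm code. It follows the formulas as they are commonly stated for van der Eijk's measure. That reading is an assumption and should be checked against the reference above. It also assumes integer frequency counts over at least three ordered categories. The function names are hypothetical.

```python
from itertools import combinations


def pattern_agreement(pattern):
    """Agreement of one layer, given as a 0/1 pattern over K >= 3 categories."""
    k = len(pattern)
    s = sum(pattern)  # number of non-empty categories in this layer
    if k < 3 or s == 0:
        raise ValueError("need at least three categories and one non-empty one")
    if s == 1:
        return 1.0  # everyone in a single category: full consensus
    if s == k:
        return 0.0  # uniform layer: no agreement, no disagreement
    tu = tdu = 0
    for triple in combinations(pattern, 3):  # ordered triples of categories
        if triple in ((1, 1, 0), (0, 1, 1)):
            tu += 1   # triple consistent with unimodality
        elif triple == (1, 0, 1):
            tdu += 1  # triple violating unimodality
    u = ((k - 2) * tu - (k - 1) * tdu) / ((k - 2) * (tu + tdu))
    return u * (1 - (s - 1) / (k - 1))


def agreement(freqs):
    """Weighted average of layer agreements for a frequency distribution."""
    total = sum(freqs)
    if total <= 0 or any(f < 0 for f in freqs):
        raise ValueError("frequencies must be non-negative with a positive sum")
    remaining = list(freqs)
    result = 0.0
    while any(remaining):
        height = min(f for f in remaining if f > 0)      # thickness of the layer
        pattern = [1 if f > 0 else 0 for f in remaining]
        weight = height * sum(pattern) / total            # share of cases in it
        result += weight * pattern_agreement(pattern)
        remaining = [f - height if f > 0 else 0 for f in remaining]
    return result


consensus = agreement([10, 0, 0, 0, 0])
uniform = agreement([2, 2, 2, 2, 2])
polarized = agreement([5, 0, 0, 0, 5])
```

Under these assumptions the three limiting cases behave as expected: full consensus gives 1, a uniform distribution gives 0, and a split between the two extreme categories gives -1.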
16:00-16:30 Yet another program to create publication quality tables Tamas Bartus Institute of Sociology and Social Policy, Corvinus University, Budapest, tamas.bartus@uni-corvinus.hu
Stata users have developed several programs to create publication-quality documents containing regression results (outreg, outreg2, outtex, estout), tables of statistics (tabout), and contents of matrices (outtable). So far, less effort has been made to enable the easy publication of other kinds of tables, like those displaying the definitions of variables and summary statistics. Although the sophisticated estout package can create tables other than regression results, the underlying mechanism of posting results as if they were estimation results has limitations, and removing these limitations would involve additional programming. The user-written command publish (working title) is intended for users with limited knowledge of programming. It creates publication-quality HTML, MS Word, or LaTeX documents which may consist of tables displaying definitions of variables, codebooks, summary statistics, one-way and two-way tables of frequencies, tables of various statistics, and estimation results. Large tables where results are shown separately for various subsamples or for several crosstabulations with a common dependent variable, or tables combining different sorts of elementary tables, can easily be created. Users can also publish matrices and parts of the data in memory, and can create empty tables into which results from other tables can be pasted. Controlling the table layout and the column (and supercolumn) titles is also easy through a small number of common options.
16:30-17:00 RDS: a Stata Program for Respondent Driven Sampling Matthias Schonlau and Elisabeth Birkner DIW and RAND Corporation mschonlau@diw.de, ebirkner@diw.de
Respondent driven sampling (RDS) is a sampling technique typically employed for hard-to-reach populations (e.g. homeless, people with AIDS, immigrants). Briefly, initial seed respondents recruit additional respondents from their network of friends. The recruiting process repeats iteratively, thereby forming long referral chains. It is crucial to obtain estimates of respondents' network size (e.g. number of friends with the characteristic of interest). RDS shares some similarities with snowball sampling, but the theoretical foundation for inference using RDS samples is much stronger. We will give a brief overview of this technique and introduce a new user-written Stata command for RDS.
17:00-17:15 Coffee
17:15-17:45 Report to the Users Bill Gould StataCorp wgould@stata.com
Bill Gould, President of StataCorp and head of development, talks about Stata.
17:45-18:30 Wishes and Grumbles
Registration and accommodations
-------------------------------
Please travel at your own expense. The conference fee will be 35 Euro (Students 15 Euro) to cover costs for coffee, teas, and luncheons. There will also be an optional informal meal at a restaurant in Berlin on Friday evening at additional cost.
You can enroll by contacting Anke Mrosek (anke.mrosek@dpc.de) by email or by writing, phoning, or faxing:
Anke Mrosek
Dittrich & Partner Consulting GmbH
Kieler Str. 17
42697 Solingen
Germany

Tel: +49 (0) 212 260 66 24
Fax: +49 (0) 212 260 66 66
Scientific Organizers
---------------------
The academic program of the meeting is being organized by Johannes Giesecke (giesecke@wzb.eu), Martin Groß (martin.gross@sowi.hu-berlin.de), and Ulrich Kohler (kohler@wzb.eu).
Logistics organizers
--------------------
The logistics are being organized by Dittrich and Partner (http://www.dpc.de), the distributor of Stata in several countries including Germany, the Netherlands, Austria, the Czech Republic, and Hungary.