Economics of Faith, Race and Community: Getting Started (June 6-13)

I took notes as the weeks of my project progressed. The blog posts that follow are summaries of my work, my struggles, and my overall thought process throughout the summer project.

June 6-13: Getting Started


As per my abstract, my project aims to replicate and then revise Jonathan Gruber’s 2005 paper, which examines religious market structure, religious behavior, and economic outcomes. Gruber uses data from both the Integrated Public Use Microdata Series (IPUMS) and the General Social Survey (GSS), so my first plan of action is to retrieve that data and see what I am working with.

My simplified religious behavior and religious market structure model:

Religious attendance = religious density + ancestral group density + ancestry + age + sex + year + error

My outcomes models:

Income = predicted religious density + ancestry density + age + sex + year + error

Only two days into the project and I am already struggling. While reviewing both Stata and the econometrics from my sophomore fall, I also had to collect my data. One thing I learned almost immediately is that the IPUMS and General Social Survey (GSS) websites are not very user friendly; it took me the first two days to figure out how to download the data from each source.

While I wait for Stata to finish reading the data files, both of which are at least 15 gigabytes, I write my formulas and code so I can get started right away on replicating Gruber’s work. Gruber did not include instructions on how he went about his work; I need to figure out on my own which variables he used from each data set and how he coded them in his analysis. He also created a new variable called religious density, which is the share of the people in a person’s local population who share that person’s religion. This is the part I find very interesting, and it is central to my research focus: how do religious communities themselves influence people’s lives? To create the religious density variable, I need to find something in the data that identifies the respondent’s location and then find the other respondents in that same location. For each respondent, I then total the number of people in that location who have the same religion and divide that by the location’s total population.
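To make the religious density calculation concrete, here is a minimal sketch in plain Python rather than Stata. It is only an illustration of the arithmetic described above, not Gruber’s actual procedure: the records are made up, and the field names "location" and "religion" are placeholders, not real GSS or IPUMS variable names.

```python
from collections import Counter

# Made-up respondent records. The field names "location" and "religion"
# are placeholders, not actual GSS or IPUMS variable names.
respondents = [
    {"id": 1, "location": "A", "religion": "Catholic"},
    {"id": 2, "location": "A", "religion": "Catholic"},
    {"id": 3, "location": "A", "religion": "Protestant"},
    {"id": 4, "location": "A", "religion": "Jewish"},
    {"id": 5, "location": "B", "religion": "Protestant"},
    {"id": 6, "location": "B", "religion": "Protestant"},
]


def religious_density(records):
    """Map each respondent id to the share of their location sharing their religion."""
    # Total number of respondents in each location.
    location_totals = Counter(r["location"] for r in records)
    # Number of respondents in each (location, religion) cell.
    cell_totals = Counter((r["location"], r["religion"]) for r in records)
    return {
        r["id"]: cell_totals[(r["location"], r["religion"])]
        / location_totals[r["location"]]
        for r in records
    }


density = religious_density(respondents)
for person_id, share in density.items():
    print(person_id, share)
```

In this toy data, the two Catholics in location A each get a density of 0.5 (2 of 4 people), while both respondents in location B get 1.0. Note that this version counts the respondent in their own numerator; whether to leave the respondent out is a separate choice that this sketch does not settle.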

I also learn how to recode data in Stata. The religious identity variable in the GSS data set covers a wide variety of religions; like Gruber, I plan to recode them into broader categories.
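The recoding idea can be sketched as a simple lookup table. The example below is in plain Python, not Stata, and both the detailed labels and the grouping are illustrative assumptions; they are not taken from the GSS codebook or from Gruber’s categories.

```python
# Hypothetical mapping from detailed responses to broad categories.
# The labels and groupings are illustrative assumptions, not the GSS
# codebook and not Gruber's actual scheme.
RELIGION_GROUPS = {
    "Baptist": "Protestant",
    "Methodist": "Protestant",
    "Lutheran": "Protestant",
    "Roman Catholic": "Catholic",
    "Jewish": "Jewish",
    "None": "No religion",
}


def recode_religion(raw, mapping=RELIGION_GROUPS, default="Other"):
    """Collapse a detailed religion label into a broad category."""
    if raw is None:
        # Keep missing values missing instead of lumping them into "Other".
        return None
    return mapping.get(raw.strip(), default)


raw_responses = ["Baptist", "Roman Catholic", " Lutheran ", "Buddhist", None]
recoded = [recode_religion(r) for r in raw_responses]
print(recoded)
```

Any label not listed in the mapping falls into "Other", and missing responses stay missing so that they are not counted as part of any religious group.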

In my initial meeting with Professor Parman, we talk about weighting the data. I also learn more about merging data sets and running a repeated cross-section analysis.

My plan for the next week is to continue to work with the GSS and IPUMS data, figure out how to create the religious density variable, and then plug it into some models.