Chinese Development in Africa: Post Three

For this post, I would like to discuss the process I went through to put together my data and what I was hoping to do had there been time. I had downloaded the DHS data recoded for births. This dataset contains a separate record for every child ever born to each woman interviewed. I was hoping to use two outcomes of interest: whether or not the child died and, if so, at what age. This was to be my measure of health in these countries, since reducing child mortality is a major goal of health organizations and governments the world over. The determinants were to be the number of Chinese-funded development projects in the region containing the survey cluster, the amount of money going into those projects, a host of variables about the mother and child (such as the mother's age and level of education, and the child's birth order and sex), and dummies for the regions and the countries. I expected to find that regions with more Chinese-funded development projects, and regions that received more funds, would show lower rates of child mortality and, among children who died, later ages at death.

To do this, I used ArcGIS to do spatial joins that gave each region extra variables for the number of projects in it and the amount of money going to it. I downloaded the region shapefiles from the DHS website and, after doing the joins, merged all of these shapefiles into one and exported it as a table. Then I used Stata to append all of the separate DHS survey results into one large file. Finally, I merged the table I exported from ArcGIS with this appended DHS file in a many-to-one merge, so that each birth record picked up the project count and funding for its region. This process sounds simple enough, but it was painstaking given the literally millions of lines of responses, the finickiness of these programs, and the many minor problems that always arise in data management.
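
For readers who want to see the shape of the append and many-to-one merge steps, here is a minimal sketch in plain Python. It is not the Stata code used for this project. Every name and value in it (the country and region labels, the field names, the numbers) is made up for illustration, and it assumes the region table has exactly one row per country and region pair.

```python
from collections import Counter

# Hypothetical stand-in for the table exported from ArcGIS:
# one row per region, carrying the variables added by the spatial join.
regions = [
    {"country": "CountryA", "region": "Region1", "n_projects": 3, "funding": 12.5},
    {"country": "CountryA", "region": "Region2", "n_projects": 1, "funding": 4.0},
    {"country": "CountryB", "region": "Region1", "n_projects": 0, "funding": 0.0},
]

# Hypothetical stand-ins for the separate per-country birth files:
# one row per child, many rows per region.
births_by_survey = {
    "CountryA": [
        {"region": "Region1", "child_died": 0, "birth_order": 1},
        {"region": "Region1", "child_died": 1, "birth_order": 2},
        {"region": "Region2", "child_died": 0, "birth_order": 1},
    ],
    "CountryB": [
        {"region": "Region1", "child_died": 0, "birth_order": 3},
        {"region": "Region9", "child_died": 0, "birth_order": 1},  # no match
    ],
}


def append_surveys(surveys):
    """Stack the per-country files into one list, tagging rows with country."""
    combined = []
    for country, rows in surveys.items():
        for row in rows:
            combined.append({"country": country, **row})
    return combined


def merge_many_to_one(many, one, keys):
    """Attach the single matching row of `one` to every row of `many`.

    Raises if `one` is not unique on `keys`, and returns the unmatched
    rows separately so they can be inspected rather than silently lost.
    """
    def key_of(row):
        return tuple(row[k] for k in keys)

    counts = Counter(key_of(r) for r in one)
    dupes = [k for k, n in counts.items() if n > 1]
    if dupes:
        raise ValueError(f"region table is not unique on {keys}: {dupes}")

    lookup = {key_of(r): r for r in one}
    merged, unmatched = [], []
    for row in many:
        match = lookup.get(key_of(row))
        if match is None:
            unmatched.append(row)
        else:
            merged.append({**match, **row})
    return merged, unmatched


births = append_surveys(births_by_survey)
merged, unmatched = merge_many_to_one(births, regions, ["country", "region"])
print(len(births), len(merged), len(unmatched))
```

The two checks in the merge function (uniqueness of the region keys, and a list of birth records that found no region) are the kind of thing that catches the "minor problems" mentioned above, such as region names spelled differently in the shapefile and the survey.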