Do poor countries grow faster than rich countries?¶
Dataset¶
The dataset is the Barro-Lee growth data: a panel of 138 countries for the period 1960 to 1985. The dependent variable is the national growth rate in GDP per capita for the periods 1965-1975 and 1975-1985. The growth rate in GDP over a period from $t_1$ to $t_2$ is commonly defined as $\log(GDP_{t_2}/GDP_{t_1})$. The number of variables is p = 62, and the number of complete observations is n = 90.
The full dataset and further details can be found at http://www.nber.org/pub/barro.lee, http://www.barrolee.com, and http://www.bristol.ac.uk/Depts/Economics/Growth/barlee.htm.
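To make the growth-rate definition concrete, here is a minimal illustration with made-up GDP figures (the numbers are hypothetical, not taken from the dataset):
import numpy as np
# Growth rate over t1..t2 is log(GDP_t2 / GDP_t1); numbers below are illustrative only
gdp_1965, gdp_1975 = 1200.0, 1500.0   # hypothetical GDP per capita levels
growth = np.log(gdp_1975 / gdp_1965)
print(round(growth, 4))               # 0.2231, i.e. about 22% log growth over the decade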
- Outcome : national growth rate in GDP per capita for the period 1965-1975.
- intercept: Constant.
- gdpsh465 : Real GDP per capita (1980 international prices) in 1965
- bmp1l : Black market premium. Log (1+BMP)
- freeop : Measure of "free trade openness".
- freetar : Measure of tariff restriction
- h65 : Total gross enrollment ratio for higher education in 1965.
- hm65 : Male gross enrollment ratio for higher education in 1965.
- hf65 : Female gross enrollment ratio for higher education in 1965.
- p65 : Total gross enrollment ratio for primary education in 1965.
- pm65 : Male gross enrollment ratio for primary education in 1965.
- pf65 : Female gross enrollment ratio for primary education in 1965.
- s65 : Total gross enrollment ratio for secondary education in 1965.
- sm65 : Male gross enrollment ratio for secondary education in 1965.
- sf65 : Female gross enrollment ratio for secondary education in 1965.
- fert65 : Total fertility rate (children per woman) in 1965.
- mort65 : Infant Mortality Rate in 1965.
- lifee065 : Life expectancy at age 0 in 1965.
- gpop1 : Growth rate of population.
- fert1 : Total fertility rate (children per woman).
- mort1 : Infant Mortality Rate (ages 0-1).
- invsh41 : Ratio of real domestic investment (private plus public) to real GDP.
- geetot1 : Ratio of total nominal government expenditure on education to nominal GDP.
- geerec1 : Ratio of recurring nominal government expenditure on education to nominal GDP.
- gde1 : Ratio of nominal government expenditure on defense to nominal GDP.
- govwb1 : Ratio of nominal government "consumption" expenditure to nominal GDP (using current local currency).
- govsh41 : Ratio of real government "consumption" expenditure to real GDP. (Period average).
- gvxdxe41 : Ratio of real government "consumption" expenditure net of spending on defense and on education to real GDP.
- high65 : Percentage of "higher school attained" in the total pop in 1965.
- highm65 : Percentage of "higher school attained" in the male pop in 1965.
- highf65 : Percentage of "higher school attained" in the female pop in 1965.
- highc65 : Percentage of "higher school complete" in the total pop.
- highcm65 : Percentage of "higher school complete" in the male pop.
- highcf65 : Percentage of "higher school complete" in the female pop.
- human65 : Average schooling years in the total population over age 25 in 1965.
- humanm65 : Average schooling years in the male population over age 25 in 1965.
- humanf65 : Average schooling years in the female population over age 25 in 1965.
- hyr65 : Average years of higher schooling in the total population over age 25.
- hyrm65 : Average years of higher schooling in the male population over age 25.
- hyrf65 : Average years of higher schooling in the female population over age 25.
- no65 : Percentage of "no schooling" in the total population.
- nom65 : Percentage of "no schooling" in the male population.
- nof65 : Percentage of "no schooling" in the female population.
- pinstab1 : Measure of political instability.
- pop65 : Total Population in 1965.
- worker65 : Ratio of total Workers to population.
- pop1565 : Population Proportion under 15 in 1965.
- pop6565 : Population Proportion over 65 in 1965.
- sec65 : Percentage of "secondary school attained" in the total pop in 1965.
- secm65 : Percentage of "secondary school attained" in the male pop in 1965.
- secf65 : Percentage of "secondary school attained" in the female pop in 1965.
- secc65 : Percentage of "secondary school complete" in the total pop in 1965.
- seccm65 : Percentage of "secondary school complete" in the male pop in 1965.
- seccf65 : Percentage of "secondary school complete" in the female pop in 1965.
- syr65 : Average years of secondary schooling in the total population over age 25 in 1965.
- syrm65 : Average years of secondary schooling in the male population over age 25 in 1965.
- syrf65 : Average years of secondary schooling in the female population over age 25 in 1965.
- teapri65 : Pupil/Teacher Ratio in primary school.
- teasec65 : Pupil/Teacher Ratio in secondary school.
- ex1 : Ratio of export to GDP (in current international prices)
- im1 : Ratio of import to GDP (in current international prices)
- xr65 : Exchange rate (domestic currency per U.S. dollar) in 1965.
- tot1 : Terms of trade shock (growth rate of export prices minus growth rate of import prices).
Importing the necessary libraries and overview of the dataset¶
#Import required libraries
import numpy as np
import pandas as pd
import warnings
warnings.filterwarnings("ignore")
from google.colab import drive
drive.mount('/content/drive')
Mounted at /content/drive
# Load data
df = pd.read_csv('growth.csv')
# See variables in the dataset
df.head()
| | Outcome | intercept | gdpsh465 | bmp1l | freeop | freetar | h65 | hm65 | hf65 | p65 | ... | seccf65 | syr65 | syrm65 | syrf65 | teapri65 | teasec65 | ex1 | im1 | xr65 | tot1 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | -0.024336 | 1 | 6.591674 | 0.2837 | 0.153491 | 0.043888 | 0.007 | 0.013 | 0.001 | 0.29 | ... | 0.04 | 0.033 | 0.057 | 0.010 | 47.6 | 17.3 | 0.0729 | 0.0667 | 0.348 | -0.014727 |
| 1 | 0.100473 | 1 | 6.829794 | 0.6141 | 0.313509 | 0.061827 | 0.019 | 0.032 | 0.007 | 0.91 | ... | 0.64 | 0.173 | 0.274 | 0.067 | 57.1 | 18.0 | 0.0940 | 0.1438 | 0.525 | 0.005750 |
| 2 | 0.067051 | 1 | 8.895082 | 0.0000 | 0.204244 | 0.009186 | 0.260 | 0.325 | 0.201 | 1.00 | ... | 18.14 | 2.573 | 2.478 | 2.667 | 26.5 | 20.7 | 0.1741 | 0.1750 | 1.082 | -0.010040 |
| 3 | 0.064089 | 1 | 7.565275 | 0.1997 | 0.248714 | 0.036270 | 0.061 | 0.070 | 0.051 | 1.00 | ... | 2.63 | 0.438 | 0.453 | 0.424 | 27.8 | 22.7 | 0.1265 | 0.1496 | 6.625 | -0.002195 |
| 4 | 0.027930 | 1 | 7.162397 | 0.1740 | 0.299252 | 0.037367 | 0.017 | 0.027 | 0.007 | 0.82 | ... | 2.11 | 0.257 | 0.287 | 0.229 | 34.5 | 17.6 | 0.1211 | 0.1308 | 2.500 | 0.003283 |
5 rows × 63 columns
# Dimensions of the dataset
df.shape
(90, 63)
- In this segment, we provide an empirical example of using partialling-out with Lasso to estimate the regression coefficient $β_1$ in the high-dimensional linear regression model. For any inference question, we can write Y as
$Y = β_1 D + β_2'W + ε$.
Specifically, we are interested in how the rates at which economies of different countries grow, denoted by Y, are related to the initial wealth level in each country, denoted by D, controlling for the country's institutional, educational, and other similar characteristics, denoted by W.
The relationship is captured by the regression coefficient $β_1$.
In this example, this coefficient is called the "speed of convergence/divergence," as it measures the speed at which poor countries catch up with ($β_1$ < 0) or fall behind ($β_1$ > 0) wealthy countries, controlling for W.
Our inference question here is: Do poor countries grow faster than rich countries, controlling for educational and other characteristics? In other words, is the speed of convergence negative: $β_1$<0?
This is the Convergence Hypothesis predicted by the Solow Growth Model. (Robert M. Solow is a world-renowned MIT economist who won the Nobel Prize in Economics.)
The dataset contains 90 countries and about 60 controls. Thus p ≈ 60, n = 90, and p/n is not small. This means that we operate in a high-dimensional setting.
Therefore, we expect the least squares method to provide a poor, very noisy estimate of $β_1$.
In contrast, we expect the method based on partialling-out with Lasso to provide a high quality estimate of $β_1$.
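Before running the estimation below with statsmodels, here is a minimal sketch of the partialling-out recipe itself, written with scikit-learn's Lasso for clarity. The penalty level 0.002 mirrors the ad-hoc choice used later; note that scikit-learn and statsmodels scale the L1 penalty differently, so the two implementations need not produce identical residuals.
# Sketch of partialling-out (Frisch-Waugh-Lovell logic) with Lasso:
#   1) lasso-regress Y on W, keep residuals r_Y
#   2) lasso-regress D on W, keep residuals r_D
#   3) OLS of r_Y on r_D; the slope estimates beta_1
import statsmodels.api as sm
from sklearn.linear_model import Lasso

def partial_out_lasso(Y, D, W, alpha=0.002):
    r_Y = Y - Lasso(alpha=alpha).fit(W, Y).predict(W)  # step 1
    r_D = D - Lasso(alpha=alpha).fit(W, D).predict(W)  # step 2
    return sm.OLS(r_Y, sm.add_constant(r_D)).fit()     # step 3

# Usage with the dataframe loaded above:
# fit = partial_out_lasso(df['Outcome'], df['gdpsh465'], df[df.columns[3:]])
# print(fit.summary())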
# Extract the names of control and treatment variables from varnames
xnames = df.columns[3:] # names of X variables
dandxnames = df.columns[2:] # names of D and X variables
print(xnames)
print(dandxnames)
Index(['bmp1l', 'freeop', 'freetar', 'h65', 'hm65', 'hf65', 'p65', 'pm65', 'pf65', 's65', 'sm65', 'sf65', 'fert65', 'mort65', 'lifee065', 'gpop1', 'fert1', 'mort1', 'invsh41', 'geetot1', 'geerec1', 'gde1', 'govwb1', 'govsh41', 'gvxdxe41', 'high65', 'highm65', 'highf65', 'highc65', 'highcm65', 'highcf65', 'human65', 'humanm65', 'humanf65', 'hyr65', 'hyrm65', 'hyrf65', 'no65', 'nom65', 'nof65', 'pinstab1', 'pop65', 'worker65', 'pop1565', 'pop6565', 'sec65', 'secm65', 'secf65', 'secc65', 'seccm65', 'seccf65', 'syr65', 'syrm65', 'syrf65', 'teapri65', 'teasec65', 'ex1', 'im1', 'xr65', 'tot1'], dtype='object')
Index(['gdpsh465', 'bmp1l', 'freeop', 'freetar', 'h65', 'hm65', 'hf65', 'p65', 'pm65', 'pf65', 's65', 'sm65', 'sf65', 'fert65', 'mort65', 'lifee065', 'gpop1', 'fert1', 'mort1', 'invsh41', 'geetot1', 'geerec1', 'gde1', 'govwb1', 'govsh41', 'gvxdxe41', 'high65', 'highm65', 'highf65', 'highc65', 'highcm65', 'highcf65', 'human65', 'humanm65', 'humanf65', 'hyr65', 'hyrm65', 'hyrf65', 'no65', 'nom65', 'nof65', 'pinstab1', 'pop65', 'worker65', 'pop1565', 'pop6565', 'sec65', 'secm65', 'secf65', 'secc65', 'seccm65', 'seccf65', 'syr65', 'syrm65', 'syrf65', 'teapri65', 'teasec65', 'ex1', 'im1', 'xr65', 'tot1'], dtype='object')
Barro-Lee GrowthData¶
The outcome (Y) is the realized annual growth rate of a country's wealth (Gross Domestic Product per capita).
The target regressor (D) is the initial level of the country's wealth.
The target parameter $β_1$ is the speed of convergence, which measures the speed at which poor countries catch up with rich countries.
The controls (W) include measures of education levels, quality of institutions, trade openness, and political stability in the country.
from sklearn.linear_model import LinearRegression # import linear regression from scikit-learn
from sklearn import metrics
from statsmodels.regression.linear_model import OLS #import Ordinary Least Squares from statsmodels
import statsmodels.api as sm
Y = df['Outcome'] # target variable
D = df['gdpsh465'] # target regressors
D_X = df[dandxnames] # control variables with target regressor
X = df[xnames] # control variables
D_X = sm.add_constant(D_X) # adding constants for intercept to control variables with target regressor
X = sm.add_constant(X) # adding constants for intercept to control variables
model = sm.OLS(Y, D_X) # OLS model object
results = model.fit() # training the model
results.summary() # summary of the trained model
| Dep. Variable: | Outcome | R-squared: | 0.887 |
|---|---|---|---|
| Model: | OLS | Adj. R-squared: | 0.641 |
| Method: | Least Squares | F-statistic: | 3.607 |
| Date: | Sat, 07 May 2022 | Prob (F-statistic): | 0.000200 |
| Time: | 08:45:19 | Log-Likelihood: | 238.24 |
| No. Observations: | 90 | AIC: | -352.5 |
| Df Residuals: | 28 | BIC: | -197.5 |
| Df Model: | 61 | | |
| Covariance Type: | nonrobust | | |
| | coef | std err | t | P>\|t\| | [0.025 | 0.975] |
|---|---|---|---|---|---|---|
const | 0.2472 | 0.785 | 0.315 | 0.755 | -1.360 | 1.854 |
gdpsh465 | -0.0094 | 0.030 | -0.314 | 0.756 | -0.071 | 0.052 |
bmp1l | -0.0689 | 0.033 | -2.117 | 0.043 | -0.135 | -0.002 |
freeop | 0.0801 | 0.208 | 0.385 | 0.703 | -0.346 | 0.506 |
freetar | -0.4890 | 0.418 | -1.169 | 0.252 | -1.346 | 0.368 |
h65 | -2.3621 | 0.857 | -2.755 | 0.010 | -4.118 | -0.606 |
hm65 | 0.7071 | 0.523 | 1.352 | 0.187 | -0.364 | 1.779 |
hf65 | 1.6934 | 0.503 | 3.365 | 0.002 | 0.663 | 2.724 |
p65 | 0.2655 | 0.164 | 1.616 | 0.117 | -0.071 | 0.602 |
pm65 | 0.1370 | 0.151 | 0.906 | 0.373 | -0.173 | 0.447 |
pf65 | -0.3313 | 0.165 | -2.006 | 0.055 | -0.670 | 0.007 |
s65 | 0.0391 | 0.186 | 0.211 | 0.835 | -0.341 | 0.419 |
sm65 | -0.0307 | 0.117 | -0.263 | 0.795 | -0.270 | 0.209 |
sf65 | -0.1799 | 0.118 | -1.523 | 0.139 | -0.422 | 0.062 |
fert65 | 0.0069 | 0.027 | 0.254 | 0.801 | -0.049 | 0.062 |
mort65 | -0.2335 | 0.817 | -0.286 | 0.777 | -1.908 | 1.441 |
lifee065 | -0.0149 | 0.193 | -0.077 | 0.939 | -0.411 | 0.381 |
gpop1 | 0.9702 | 1.812 | 0.535 | 0.597 | -2.742 | 4.682 |
fert1 | 0.0088 | 0.035 | 0.252 | 0.803 | -0.063 | 0.081 |
mort1 | 0.0666 | 0.685 | 0.097 | 0.923 | -1.336 | 1.469 |
invsh41 | 0.0745 | 0.108 | 0.687 | 0.498 | -0.148 | 0.297 |
geetot1 | -0.7151 | 1.680 | -0.426 | 0.674 | -4.157 | 2.726 |
geerec1 | 0.6300 | 2.447 | 0.257 | 0.799 | -4.383 | 5.643 |
gde1 | -0.4436 | 1.671 | -0.265 | 0.793 | -3.867 | 2.980 |
govwb1 | 0.3375 | 0.438 | 0.770 | 0.447 | -0.560 | 1.235 |
govsh41 | 0.4632 | 1.925 | 0.241 | 0.812 | -3.481 | 4.407 |
gvxdxe41 | -0.7934 | 2.059 | -0.385 | 0.703 | -5.012 | 3.425 |
high65 | -0.7525 | 0.906 | -0.831 | 0.413 | -2.608 | 1.103 |
highm65 | -0.3903 | 0.681 | -0.573 | 0.571 | -1.786 | 1.005 |
highf65 | -0.4177 | 0.561 | -0.744 | 0.463 | -1.568 | 0.732 |
highc65 | -2.2158 | 1.481 | -1.496 | 0.146 | -5.249 | 0.818 |
highcm65 | 0.2797 | 0.658 | 0.425 | 0.674 | -1.069 | 1.628 |
highcf65 | 0.3921 | 0.766 | 0.512 | 0.613 | -1.177 | 1.961 |
human65 | 2.3373 | 3.307 | 0.707 | 0.486 | -4.437 | 9.112 |
humanm65 | -1.2092 | 1.619 | -0.747 | 0.461 | -4.525 | 2.106 |
humanf65 | -1.1039 | 1.685 | -0.655 | 0.518 | -4.555 | 2.347 |
hyr65 | 54.9139 | 23.887 | 2.299 | 0.029 | 5.983 | 103.845 |
hyrm65 | 12.9350 | 23.171 | 0.558 | 0.581 | -34.529 | 60.400 |
hyrf65 | 9.0926 | 17.670 | 0.515 | 0.611 | -27.102 | 45.287 |
no65 | 0.0372 | 0.132 | 0.282 | 0.780 | -0.233 | 0.308 |
nom65 | -0.0212 | 0.065 | -0.326 | 0.747 | -0.154 | 0.112 |
nof65 | -0.0169 | 0.067 | -0.252 | 0.803 | -0.154 | 0.120 |
pinstab1 | -0.0500 | 0.031 | -1.616 | 0.117 | -0.113 | 0.013 |
pop65 | 1.032e-07 | 1.32e-07 | 0.783 | 0.440 | -1.67e-07 | 3.73e-07 |
worker65 | 0.0341 | 0.156 | 0.218 | 0.829 | -0.286 | 0.354 |
pop1565 | -0.4655 | 0.471 | -0.988 | 0.332 | -1.431 | 0.500 |
pop6565 | -1.3575 | 0.635 | -2.138 | 0.041 | -2.658 | -0.057 |
sec65 | -0.0109 | 0.308 | -0.035 | 0.972 | -0.641 | 0.619 |
secm65 | 0.0033 | 0.151 | 0.022 | 0.983 | -0.306 | 0.313 |
secf65 | -0.0023 | 0.158 | -0.015 | 0.988 | -0.326 | 0.321 |
secc65 | -0.4915 | 0.729 | -0.674 | 0.506 | -1.985 | 1.002 |
seccm65 | 0.2596 | 0.356 | 0.730 | 0.471 | -0.469 | 0.988 |
seccf65 | 0.2207 | 0.373 | 0.591 | 0.559 | -0.544 | 0.985 |
syr65 | -0.7556 | 7.977 | -0.095 | 0.925 | -17.095 | 15.584 |
syrm65 | 0.3109 | 3.897 | 0.080 | 0.937 | -7.671 | 8.293 |
syrf65 | 0.7593 | 4.111 | 0.185 | 0.855 | -7.661 | 9.180 |
teapri65 | 3.955e-05 | 0.001 | 0.051 | 0.959 | -0.002 | 0.002 |
teasec65 | 0.0002 | 0.001 | 0.213 | 0.833 | -0.002 | 0.003 |
ex1 | -0.5804 | 0.242 | -2.400 | 0.023 | -1.076 | -0.085 |
im1 | 0.5914 | 0.250 | 2.363 | 0.025 | 0.079 | 1.104 |
xr65 | -0.0001 | 5.42e-05 | -1.916 | 0.066 | -0.000 | 7.18e-06 |
tot1 | -0.1279 | 0.113 | -1.136 | 0.266 | -0.359 | 0.103 |
| Omnibus: | 0.439 | Durbin-Watson: | 1.982 |
|---|---|---|---|
| Prob(Omnibus): | 0.803 | Jarque-Bera (JB): | 0.417 |
| Skew: | 0.158 | Prob(JB): | 0.812 |
| Kurtosis: | 2.896 | Cond. No. | 7.52e+08 |
Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
[2] The condition number is large, 7.52e+08. This might indicate that there are
strong multicollinearity or other numerical problems.
- The in-sample fit appears strong: R² is 0.887.
- The much lower adjusted R² (0.641) makes sense: with roughly 60 regressors for 90 observations, many features are likely capturing noise in the dataset.
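As a quick check on the gap between the two numbers, statsmodels computes the adjusted R² for a model with an intercept as 1 − (1 − R²)(n − 1)/df_resid, which reproduces the values in the summary above:
# Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / df_resid
r2, n, df_resid = 0.887, 90, 28  # values taken from the summary table above
adj_r2 = 1 - (1 - r2) * (n - 1) / df_resid
print(round(adj_r2, 3))          # 0.641, matching the reported Adj. R-squared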
# Lasso of the outcome (Y) on the control variables
model = sm.OLS(Y, X) # OLS model object
results = model.fit() # training the model
results_Y = model.fit_regularized(alpha=0.002, L1_wt=1.0, start_params=results.params)
# alpha = penalty level; L1_wt = 1.0 selects a pure lasso (L1) penalty
# Wrap the regularized coefficients in an OLSResults object so residuals are available
final = sm.regression.linear_model.OLSResults(model,
                                              results_Y.params,
                                              model.normalized_cov_params)
r_Y = final.resid # residuals of Y after partialling out the controls
# Lasso of the target regressor (D) on the control variables
model = sm.OLS(D, X)
results = model.fit()
results_D = model.fit_regularized(alpha=0.002, L1_wt=1.0, start_params=results.params)
final = sm.regression.linear_model.OLSResults(model,
                                              results_D.params,
                                              model.normalized_cov_params)
r_D = final.resid # residuals of D after partialling out the controls
# OLS of the outcome residuals on the treatment residuals (the final partialling-out step)
Y = r_Y # target variable
X = r_D
X = sm.add_constant(X)
model = sm.OLS(Y, X)
results = model.fit() # train the model
print(results.summary())
                            OLS Regression Results
==============================================================================
Dep. Variable:                      y   R-squared:                       0.045
Model:                            OLS   Adj. R-squared:                  0.034
Method:                 Least Squares   F-statistic:                     4.143
Date:                Sat, 07 May 2022   Prob (F-statistic):             0.0448
Time:                        08:45:20   Log-Likelihood:                 128.89
No. Observations:                  90   AIC:                            -253.8
Df Residuals:                      88   BIC:                            -248.8
Df Model:                           1
Covariance Type:            nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const          0.0011      0.006      0.177      0.860      -0.011       0.013
x1            -0.0431      0.021     -2.035      0.045      -0.085      -0.001
==============================================================================
Omnibus:                        2.328   Durbin-Watson:                   1.681
Prob(Omnibus):                  0.312   Jarque-Bera (JB):                2.020
Skew:                           0.250   Prob(JB):                        0.364
Kurtosis:                       2.463   Cond. No.                         3.44
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
| | Estimate | Standard Error | 95% Confidence Interval |
|---|---|---|---|
| Least squares | -0.0094 | 0.030 | [-0.071, 0.052] |
| Partialling-out via lasso | -0.0431 | 0.021 | [-0.085, -0.001] |
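This comparison table can be assembled programmatically from the two fitted results objects. A sketch, assuming the full-model OLS fit from the first regression was saved under a separate (hypothetical) name, say ols_full, before the name results was reused for later fits:
import numpy as np
import pandas as pd
# ols_full: OLS fit of Y on D and all controls (hypothetical saved name);
# results: the residual-on-residual OLS fitted just above
ci_ols = ols_full.conf_int().loc['gdpsh465'].round(3).tolist()
ci_po = np.round(np.asarray(results.conf_int())[1], 3).tolist()
tab = pd.DataFrame(
    {'Estimate': [ols_full.params['gdpsh465'], np.asarray(results.params)[1]],
     'Standard Error': [ols_full.bse['gdpsh465'], np.asarray(results.bse)[1]],
     '95% Confidence Interval': [ci_ols, ci_po]},
    index=['Least squares', 'Partialling-out via lasso'])
print(tab)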
As expected, least squares provides a rather noisy estimate of the speed of convergence, and does not allow us to answer the question about the convergence hypothesis.
In sharp contrast, partialling-out via Lasso provides a more precise estimate.
The lasso-based point estimate is -0.043, and the 95% confidence interval for the (annual) rate of convergence is [-0.085, -0.001].
This empirical evidence does support the convergence hypothesis.
Conclusions¶
In this segment, we have examined an empirical example in the high-dimensional setting.
Least squares yields a very noisy estimate of the target regression coefficient and does not allow us to answer an important empirical question.
Lasso does yield a precise estimate of the regression coefficient and does allow us to answer that question.
We have found significant empirical evidence supporting the convergence hypothesis of Solow.
# Convert notebook to html
!jupyter nbconvert --to html "/content/drive/My Drive/Colab Notebooks/Copy of FDS_Project_LearnerNotebook_FullCode.ipynb"