# ECO 440/640 — Problem Set 4

Pay attention to this paragraph: it illustrates how to describe a data set and define variables. This problem set uses data on the Agora marketplace collected by your instructor in December 2014. Agora is a Darknet market where people typically buy drugs and other illicit materials. Each observation in the data set is one offer of a drug for sale (similar to an eBay listing). For ease of exposition, I have limited the sample to randomly selected listings observed on two days and to only “weed” versions of cannabis. The data set contains the following variables.

| Variable | Description |
| --- | --- |
| `PriceBTC` | Price in Bitcoin |
| `Grams` | Grams of weed offered |
| `Type` | String information about strain or quality |
| `Seller` | User name of the seller |
| `Origin` | Country from which the drugs would be shipped |
| `OriginIsUSA` | Recode of `Origin`: 1 if `Origin` is USA and 0 otherwise |
| `To` | Places to which seller will ship |
| `Shipping` | Information in the listing about shipping options and prices (a number means dollars unless otherwise specified) |
| `FeedbackCount` | Number of items of feedback the seller has received on the site |
| `MostRecentFeedback_Days` | Number of days since the seller last received feedback |
| `OldestFeedback_Days` | Number of days since the seller first received feedback |
| `Score` | Average feedback score for the seller (feedback scores can be 1, 2, 3, 4, or 5) |
| `NumberOfDeals` | String with bins showing the number of sales the seller has made on the site |
| `NumberOfDeals_Continuous` | Recode of `NumberOfDeals` into a continuous numeric variable with the midpoint of the range in `NumberOfDeals` (1000+ was coded to 1000) |
| `Date` | Date the listing was observed |
| `Date` | Time the listing was observed |
| `PricePerGram` | `PriceBTC` / `Grams` |
| `DollarsPerBTC` | Exchange rate (USD per BTC) on the day the listing was observed |
| `PricePerGramDollars` | `PricePerGram` × `DollarsPerBTC` |
| `URL` | URL of the listing (at the time it was observed) |
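The derived variables listed above follow directly from their definitions. The following is a minimal sketch in R of how they could be built, assuming a data frame with the column names given above. The data frame name `agora_demo` and its three rows are made up for illustration (they are not actual Agora listings), and the exact spelling of the origin string `"USA"` is an assumption.

```r
# Hypothetical rows for illustration only; not actual Agora data.
agora_demo <- data.frame(
  PriceBTC      = c(0.05, 0.20, 0.90),
  Grams         = c(1, 7, 28),
  Origin        = c("USA", "Germany", "USA"),
  DollarsPerBTC = c(350, 350, 360),
  stringsAsFactors = FALSE
)

# PricePerGram = PriceBTC / Grams
agora_demo$PricePerGram <- agora_demo$PriceBTC / agora_demo$Grams

# PricePerGramDollars = PricePerGram x DollarsPerBTC
agora_demo$PricePerGramDollars <-
  agora_demo$PricePerGram * agora_demo$DollarsPerBTC

# OriginIsUSA: 1 if Origin is USA and 0 otherwise
# (assumes the country is stored exactly as "USA")
agora_demo$OriginIsUSA <- as.integer(agora_demo$Origin == "USA")

print(agora_demo)
```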

## Part A

Table 1 shows R output (not “RStudio output”) with some values missing. The table contains enough information to fill in every missing value, so do that. Include significance stars where appropriate, and explain how you obtained each value. The regression is similar to the ones in Part B but uses separate indicators for multiple countries of origin (instead of just the US).

Call: `lm(formula = PricePerGramDollars ~ log2(Grams) + Score + FeedbackCount + factor(Origin) + NumberOfDeals_Continuous, data = agora)`

Coefficients:

| | Estimate | Std. Error | t value | Pr(>\|t\|) | |
| --- | --- | --- | --- | --- | --- |
| (Intercept) | 4.6305802 | 24.2787089 | 0.191 | 0.84946 | |
| log2(Grams) | -1.1909527 | 0.1349346 | | | |
| Score | 2.8133501 | | 0.573 | 0.56929 | |
| FeedbackCount | | 0.0476203 | -0.858 | 0.39457 | |
| factor(Origin)Canada | -3.8790813 | 3.2676473 | -1.187 | 0.24038 | |
| factor(Origin)Germany | 2.9483056 | 1.6652960 | 1.770 | 0.08230 | . |
| factor(Origin)USA | -3.7883966 | 1.1314708 | -3.348 | | |
| NumberOfDeals_Continuous | -0.0004706 | | | 0.38933 | |

`Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1`

Residual standard error: 2.273 on 54 degrees of freedom

Multiple R-squared: 0.7267, Adjusted R-squared: 0.6913

F-statistic: 20.52 on 7 and 54 DF, p-value: 3.802e-13
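The columns of this kind of output are tied together by a few identities: the t value is the estimate divided by its standard error, and the p-value is the two-sided tail probability of that t value under a t distribution with the residual degrees of freedom. As a hedged sketch, the R code below checks those identities on the one fully reported row, `(Intercept)`. The variable names and the helper `sig_stars` are illustrative and not part of the original output.

```r
# Values copied from the complete (Intercept) row of Table 1.
estimate  <- 4.6305802
std_error <- 24.2787089
df_resid  <- 54  # residual degrees of freedom reported in the output

# t value = Estimate / Std. Error
t_value <- estimate / std_error

# Two-sided p-value from the t distribution
p_value <- 2 * pt(-abs(t_value), df = df_resid)

# Going backwards: the |t| implied by a two-sided p-value
t_from_p <- qt(1 - p_value / 2, df = df_resid)

# Significance stars using the cutoffs in the "Signif. codes" line
# (sig_stars is a hypothetical helper written for this sketch)
sig_stars <- function(p) {
  as.character(cut(p,
                   breaks = c(0, 0.001, 0.01, 0.05, 0.1, 1),
                   labels = c("***", "**", "*", ".", " "),
                   include.lowest = TRUE))
}

print(round(c(t_value = t_value, p_value = p_value, t_from_p = t_from_p), 5))
print(sig_stars(0.08230))  # p-value from the Germany row
```

The same identities can be rearranged: an estimate and a t value imply a standard error, and a standard error and a t value imply an estimate.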

## Part B

Table 2 contains estimates of the economies of scale in marijuana purchases with data from Agora auctions. All of the subsequent questions refer to Table 2.

Dependent variable: Price Per Gram ($) in columns (1) and (2); Log of Price Per Gram ($) in columns (3) and (4). Standard errors are in parentheses.

| | (1) | (2) | (3) | (4) |
| --- | --- | --- | --- | --- |
| Constant | 15.420*** (0.749) | -18.673 (18.988) | 2.776*** (0.073) | 0.394 (1.960) |
| Log (base 2) of grams | -1.296*** (0.152) | -1.199*** (0.139) | | |
| Log of grams | | | -0.191*** (0.021) | -0.177*** (0.021) |
| Seller Rating (out of 5) | | 7.685* (3.891) | | 0.545 (0.402) |
| Feedback count | | -0.049 (0.049) | | -0.003 (0.005) |
| Origin is USA | | -4.594*** (0.934) | | -0.401*** (0.096) |
| Number of deals by seller | | -0.0004 (0.001) | | -0.00000 (0.0001) |
| Observations | 62 | 62 | 62 | 62 |
| R2 | 0.547 | 0.698 | 0.572 | 0.679 |
| Adjusted R2 | 0.540 | 0.671 | 0.565 | 0.650 |
| Residual Std. Error | 2.776 | 2.346 | 0.270 | 0.242 |

Note: \*p<0.1; \*\*p<0.05; \*\*\*p<0.01

1. For each of the estimated regressions, write out the implied regression model.
2. Interpret the coefficient on Grams in each regression (they are not all the same). Use plain language (there should be nothing in your interpretation like “the log of grams increases by...”).
3. Interpret the coefficient on OriginIsUSA. Use plain language.
4. How would we interpret the intercept in regression 2? Does it make sense? What should we do about it?
5. Interpret the p-value on the log of Grams in regression 4 and explain what you would do with that p-value.
6. Construct a 95% confidence interval and a 99% confidence interval (using the correct degrees of freedom) for the coefficient on the log of Grams in regression 4. Interpret one of these confidence intervals.
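
For the confidence intervals in the last question, the usual construction is the estimate plus or minus a t critical value times the standard error, with residual degrees of freedom equal to the number of observations minus the number of estimated coefficients (including the intercept). The R code below is a minimal sketch of that construction. The helper `coef_ci` and every number passed to it are hypothetical and deliberately not taken from Table 2.

```r
# Hypothetical helper: confidence interval for one regression coefficient.
coef_ci <- function(estimate, std_error, n_obs, n_coefs, level = 0.95) {
  # Residual degrees of freedom; n_coefs counts the intercept too.
  df_resid <- n_obs - n_coefs
  # Two-sided critical value from the t distribution.
  t_crit <- qt(1 - (1 - level) / 2, df = df_resid)
  c(lower = estimate - t_crit * std_error,
    upper = estimate + t_crit * std_error)
}

# Made-up inputs: estimate 0.50, standard error 0.10,
# 30 observations, 3 estimated coefficients (so 27 degrees of freedom).
ci95 <- coef_ci(0.50, 0.10, n_obs = 30, n_coefs = 3, level = 0.95)
ci99 <- coef_ci(0.50, 0.10, n_obs = 30, n_coefs = 3, level = 0.99)

print(ci95)
print(ci99)
```

The 99% interval is always wider than the 95% interval for the same estimate. With a fitted `lm` object, R's built-in `confint()` with its `level` argument produces intervals of this kind directly.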