My (new) GitHub repositories contain some code I have written and data I have cleaned.

13

Jun 13

## Plot multiple kernel densities on one plot in Stata

If you want to compare kernel density estimates of a variable across years, plotting every estimate on one graph makes the comparison easy. The process is fairly straightforward in Stata (and even easier in Matlab…). First, we start with the simple `kdensity` command:

`kdensity income if year == 1990`

Next, we extend this command with the `addplot()` option:

`kdensity income if year == 1990, addplot(kdensity income if year == 1991)`

and we can add even more with the `||` separator:

`kdensity income if year == 1990, addplot(kdensity income if year == 1991 || kdensity income if year == 1992)`

If we could use the `by` option, this process would be much cleaner. Finally, we add a legend:

`kdensity income if year == 1990, addplot(kdensity income if year == 1991 || kdensity income if year == 1992) legend(ring(0) pos(2) label(1 "1990") label(2 "1991") label(3 "1992"))`

19

May 13

## LaTeX files for top finance journals

Richard Stanton provides an excellent source of tex files (styles, bst, etc.) for major finance journals.

- Journal of Finance style file
- Journal of Finance bibtex style file
- JFE style file
- JFE bibtex style file
- Review of Financial Studies (RFS) style file
- Review of Financial Studies (RFS) bibtex style file

Some notes:

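- Inline (run-in) headings for RFS can be produced with the `titlesec` package. The snippet below applies the run-in format to `\subsection`; to target `\subsubsection` instead, replace `\subsection` and `\thesubsection` with `\subsubsection` and `\thesubsubsection`: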

```
\usepackage{titlesec}
\titleformat{\subsection}[runin]
{\normalfont\large\bfseries}{\thesubsection}{1em}{}
```

04

Aug 12

## Stata LaTeX summary tables: a hack

The Stata package `estout` (which provides `eststo`, `estpost`, and `esttab`) has a nice set of commands to create summary statistics tables. These commands can create “fragments” that you can dynamically add to tables in a main tex document. Unfortunately, the output of the following example starts with an `\hline` and doesn’t play well with LaTeX:

`eststo clear `

`eststo: estpost tabstat var1 var2 var3, listwise statistics(mean sd) columns(statistics) `

Saving to a tex file:

`esttab using "../writing/tables/summary.tex", main(mean) aux(sd) nostar unstack nonote nomtitle nonumber replace label fragment `

creates a tex file that starts with `\hline`. When you pull it into a tabular environment with the `\input` command (here `\input{tables/summary.tex}`), you get a “Misplaced \noalign” error. I found a partial solution online and adapted it with a few changes:

1. Add `\makeatletter` followed by `\newcommand*\ExpandableInput[1]{\@@input#1 }` to your header (the preamble).

2. Make sure the header also has `\makeatother` right after that definition.

3. Replace the `\input` command with `\ExpandableInput{tables/summary.tex}`.

4. If you want to end the table with more `\hline`s, put a line break (`\\`) after the above command.

“Simple” as that….

18

May 12

## Simple Bootstrap Sample in Matlab

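Let x be the vector (or matrix, with one observation per row) of data to be bootstrapped. Inside each iteration of the bootstrap loop, write:

`xb = x(floor(size(x,1)*rand(size(x,1),1))+1,:); % bootstrap observations`

Now `xb` has the same number of rows and columns as `x`, but its rows are re-sampled from `x` with replacement. Assigning to a new variable keeps the original `x` intact, so every iteration resamples from the original data rather than from the previous draw. `size(x,1)` gives the row count using only built-in Matlab functions; the original line relied on a `rows` helper. A one-line bootstrap! (Modified code from John Cochrane)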
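For readers working outside Matlab, the same resampling idea can be sketched in Python with only the standard library. This is a minimal illustration, not Cochrane's code: the function names `bootstrap_sample` and `bootstrap_means`, the toy data, and the choice of the sample mean as the statistic are all assumptions made for the example.

```python
import random

def bootstrap_sample(rows, rng=random):
    """Return a resample of `rows` with replacement, same length as the input."""
    n = len(rows)
    # Draw n row indices uniformly from 0..n-1, with replacement
    idx = [rng.randrange(n) for _ in range(n)]
    return [rows[i] for i in idx]

def bootstrap_means(data, n_reps=1000, seed=0):
    """Bootstrap distribution of the sample mean (hypothetical helper)."""
    rng = random.Random(seed)
    means = []
    for _ in range(n_reps):
        # Always resample from the original data, never from a previous draw
        sample = bootstrap_sample(data, rng)
        means.append(sum(sample) / len(sample))
    return means

# Toy data, invented for illustration
data = [2.0, 4.0, 4.0, 5.0, 7.0, 9.0]
means = bootstrap_means(data, n_reps=500, seed=42)
```

The key design point is the same as in the Matlab line: each replication draws indices with replacement and reads from the untouched original data.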

18

May 12

## Using Remote Files in Matlab

If you have an FTP server set up somewhere with ample space and bandwidth, Matlab can store and retrieve its files and data remotely. Just use these simple commands to connect and disconnect:

`% File to connect to the server where my Matlab/data resides`

`% connect to the FTP server`

`f = ftp('yourdomain.com', 'user', 'password');`

`% change the remote directory on the server`

`cd(f, 'matlab');`

`% change the local directory to where the remote directory should be downloaded`

`cd('/');`

`% download the directory`

`mget(f, 'remote_dir');`

`% *****************`

`% INSERT PROGRAM HERE`

`% *****************`

`% when done, move up one local directory`

`cd ..`

`% now upload the directory back to the server`

`mput(f, 'remote_dir');`

`disp('Files have been put back on the server');`

`% close the connection`

`close(f);`

17

May 11

## Simple R functions to keep or remove data frame columns

This function removes columns from a data frame by name:

`removeCols <- function(data, cols){ return(data[,!names(data) %in% cols]) }`

This function keeps columns of a data frame by name:

`keepCols <- function(data, cols){`

`return(data[,names(data) %in% cols]) }`

Or combine both into one function:

`colKeepRemove <- function(data, cols, remove=1){`

`if(remove == 1){ return(data[,!names(data) %in% cols]) }`

`else { return(data[,names(data) %in% cols]) }}`

04

Jan 11

## Running sums in Stata

Perhaps it is bad that I didn’t know this before, but the following Stata code would have saved me a week of dissertation work. Suppose that you have data structured like so:

`firm_id,date,amount`

and you want to create a new variable that is the total amount as of each date for each firm. In Stata, you simply type:

`sort firm_id date`

`bysort firm_id: gen total_t = sum(amount)`

Note the use of `gen` rather than `egen`. The `sum()` function behaves differently under the two commands: with `gen` it returns a running sum, while with `egen` it returns the group total on every row (the `egen` function is now called `total()`). Knowing this, about 500 lines of loops I had written in Stata could have been condensed into a few lines. Stata needs to fix the `egen` and `gen` distinction or I need to port more of my projects to R.
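To make explicit what the running sum computes, here is a rough Python sketch of the same logic using only the standard library. The function name `running_totals`, the tuple layout, and the sample records are assumptions made for illustration; dates are assumed to be ISO-formatted strings so that sorting them as text is chronological.

```python
from itertools import groupby
from operator import itemgetter

def running_totals(records):
    """Add a per-firm running total to (firm_id, date, amount) records."""
    # Sort by firm, then by date (ISO date strings sort chronologically)
    ordered = sorted(records, key=itemgetter(0, 1))
    out = []
    for firm_id, group in groupby(ordered, key=itemgetter(0)):
        total = 0  # reset the accumulator at the start of each firm
        for _, date, amount in group:
            total += amount
            out.append((firm_id, date, amount, total))
    return out

# Sample records, invented for illustration
records = [
    ("A", "2010-01-01", 10),
    ("B", "2010-01-01", 5),
    ("A", "2010-02-01", 20),
    ("B", "2010-03-01", 7),
    ("A", "2010-03-01", 30),
]
result = running_totals(records)
```

Each output row carries the cumulative amount for its firm up to and including that date, which is the quantity the Stata running sum stores in `total_t`.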

04

Jan 11

## LaTeX regression and summary tables in Stata

A clean, well-organized LaTeX table is difficult to build. If you do a lot of analysis in Stata, there are several tools to output LaTeX tables of your regressions or summary statistics. These packages do not always work perfectly with the standard options. Below I present two example code snippets: one produces a LaTeX table of a set of regressions that includes an IV estimator, and the other produces a summary statistics table that compares two groups in a database. Each uses the `estout` package (which provides `eststo`, `estpost`, and `esttab`).

**Regression Table with Multiple Equations and Stages**

Here I run a couple of limited dependent variable models and a two-stage bivariate probit with an IV. The output isn’t perfect, but it works for pre-submission distribution.

The output looks like this:

Latex regression output from eststo

**Summary statistics with a by variable**

Next, consider summarizing the characteristics of two groups in your data. For example, I want to compare the age, number of board seats, and other features of venture capital spinoff founders to everyone else.

The output will look like this:

The documentation for `esttab`, `estpost`, and `eststo` covers many more options and includes a lot of examples.