21st Century Researchers

Stata: Dealing with Date variables

Monday, May 21, 2012

Working with date variables in a dataset takes a few extra steps. If the conversion is not done correctly, your dates will be stored inconsistently and your results will be wrong. Here are some tips for handling date variables.

Generating date variables

If you import your data from Excel, your date variables may be treated as 1) string variables or 2) integers (less likely). An example of a date stored as a string is 1 Jan 1960; stored as an integer, it would look like 17869. Both can easily be converted into date format.

String format

If your values look like 1 Jan 1960, you can use the following code:

gen newvar = date(oldvar, "DMY")

After running this line, you will see values such as 0 or 1046, because Stata stores dates as integers. Each value is the number of days before or after 1 Jan 1960: a positive value is a date after 1 Jan 1960, and a negative value is a date before it. This storage format is used in many datasets, so don’t be surprised; we will convert it to a readable date format later.
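To make the day counts concrete, here is a minimal sketch on two made-up dates. The variable names oldvar and newvar follow the line above; the toy data are my own assumption, not from a real dataset.

```stata
* Toy data: two dates typed in as strings (hypothetical values)
clear
input str11 oldvar
"1 Jan 1960"
"12 Nov 1962"
end

* Convert the strings to Stata daily dates (days since 1 Jan 1960)
gen newvar = date(oldvar, "DMY")

* newvar holds plain numbers at this point, not formatted dates
list
```

The first row should come out as 0 and the second as 1046, since 12 Nov 1962 is 1046 days after 1 Jan 1960.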

Integer format

If your values are in integer format, you need the following line to display them in date format.

Simple version:

format newvar %td

A slightly more elaborate version:

format %tdnn/dd/CCYY newvar

You can choose either one.
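As a quick check of what the two formats do, here is a small sketch using the integer 17869 mentioned earlier. The one-row dataset is an assumption made up for illustration; only the display changes, while the stored number stays the same.

```stata
* One-row toy dataset holding the raw integer from the example above
clear
set obs 1
gen newvar = 17869

* Default daily date format
format newvar %td
list

* Month/day/year with a four-digit year
format newvar %tdnn/dd/CCYY
list

* The underlying value is unchanged by either format
summarize newvar
```

With %td the value should display as 03dec2008, and with %tdnn/dd/CCYY as 12/03/2008.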

Date comparisons

Once your variables are in date format, comparing two dates is easy. To find the number of days between two date variables, use the gen command to create a new variable equal to one date minus the other.
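A minimal sketch of the subtraction, assuming two string dates in day-month-year order. The names start_date, end_date, and days_between are hypothetical.

```stata
* Toy data: one start date and one end date as strings
clear
input str11 start_str str11 end_str
"1 Jan 2012" "21 May 2012"
end

* Convert both strings to Stata daily dates
gen start_date = date(start_str, "DMY")
gen end_date   = date(end_str, "DMY")
format start_date end_date %td

* Days between the two dates: one date minus the other
gen days_between = end_date - start_date
list
```

Here days_between should be 141. Swapping the order of the subtraction only flips the sign.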

Adding or subtracting days from a date

If you want to add, for example, 120 days to your date variable, or subtract 120 days from it, the approach is very similar to date comparisons. Simply generate another variable that equals your date variable plus or minus 120.
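For instance, a sketch with a single made-up date (the names visit, followup, and lookback are assumptions for illustration):

```stata
* Toy data: one visit date, built directly with mdy()
clear
set obs 1
gen visit = mdy(1, 1, 2012)

* 120 days after and 120 days before the visit
gen followup = visit + 120
gen lookback = visit - 120

* New variables need a date format too, or they display as raw numbers
format visit followup lookback %td
list
```

Note that the format command has to be applied to the new variables as well; otherwise they show up as day counts.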

Comparison with a specific date

If you want to compare a date variable with a specific date, such as 1 Jan 2000, you can use the following code:

gen newvar = (mdy(1,1,2000) - oldvar)

The code above gives the number of days between each date and 1 Jan 2000: the value is positive if the date falls before 1 Jan 2000 and negative if it falls after. If your date variable is a birthday, you can divide the value by 365.25 to get each participant's approximate age on 1 Jan 2000.
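Here is a small sketch of the age calculation on two invented birthdays. The variable names are hypothetical, and 365.25 is only an approximation of the average year length.

```stata
* Toy data: two birthdays as strings
clear
input str11 birthday_str
"1 Jan 1990"
"15 Jun 2005"
end

gen birthday = date(birthday_str, "DMY")
format birthday %td

* Days from each birthday to 1 Jan 2000, then approximate age in years
gen days_to_2000 = mdy(1, 1, 2000) - birthday
gen age_2000 = days_to_2000 / 365.25
list
```

The second person was born after 1 Jan 2000, so both new variables come out negative for that row.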

Further reading

Using dates in Stata http://www.ats.ucla.edu/stat/stata/modules/dates.htm

How to know if one journal is a SSCI or SCI journal?

Friday, April 27, 2012

SCI stands for Science Citation Index, and SSCI stands for Social Sciences Citation Index. I personally do not think SCI and SSCI are very important in American academia; however, in some countries, such as Taiwan and China, SSCI and SCI are used to rank universities. In this situation, knowing whether your target journal is an SCI or SSCI journal is critical.

There are a few ways to find out.

1) Check the official SCI and SSCI sites (recommendation: 4 out of 5)

SCI: http://science.thomsonreuters.com/cgi-bin/jrnlst/jloptions.cgi?PC=D
SSCI: http://www.thomsonscientific.com/cgi-bin/jrnlst/jloptions.cgi?PC=J

Take the SSCI site as an example. If I want to know whether Foreign Language Annals is an SSCI journal, I simply type foreign in the search box.

Hit search and you will see the results. Voila, it is an SSCI journal.

The second approach (recommendation: 5 out of 5) is to use the following site:

http://publik.tuwien.ac.at/info/sci_search.php?lang=2

This site is self-explanatory, and I do not see the need to do any screen capturing.

The third approach is to use Web of Science (recommendation: 3 out of 5), but it requires an annual subscription. Please check whether your library subscribes to it.

http://apps.isiknowledge.com/WOS_GeneralSearch_input.do?product=WOS&search_mode=GeneralSearch&preferencesSaved=

Select publication name and type foreign l*.

Click analyze results.

It will show the fields that you can analyze. Since we are looking for the journal, choose source title and click analyze.

The fourth approach is to use Journal Citation Reports, but this one also requires a library subscription (recommendation: 4 out of 5).

http://admin-apps.isiknowledge.com/JCR/JCR

Stata: Count groups by individuals

Wednesday, April 25, 2012

One friend asked me the following question:

How can I transform the following format into:

id level
1 A
1 A
1 B
2 A
2 B
3 B

this one?

id level #ofA #ofB
1 A 2 1
1 A 2 1
1 B 2 1
2 A 1 1
2 B 1 1
3 B 0 1

Well, I do not think there is a single command for this task, but it is not very difficult when there are only two groups to count.

The way I achieve this task is:

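* Load the example data
use "http://images.researcher20.com/stata_group/stata_group.dta", clear
* level is a string, so create a numeric group id (A = 1, B = 2)
egen acount = group(level)
* Count As within each id; rows where level is B get missing
gsort +id +acount
by id: egen acount2 = count(acount) if acount==1
* Carry the count forward to the B rows, then set ids with no A to zero
bys id: replace acount2 = acount2[_n-1] if acount2==.
replace acount2=0 if acount2==.
* Count Bs the same way, sorting B first so the count carries to the A rows
bys id: egen bcount2 = count(acount) if acount==2
gsort +id -level
by id: replace bcount2 = bcount2[_n-1] if bcount2==.
replace bcount2=0 if bcount2==.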

The level variable is a string variable, so I use egen to create a numeric group id. If you are interested in more details, see my previous post: Stata: Create id by group.

After creating the group id, I sort by id and group id. How do I count the As and Bs? I count the number of As using egen, but you may notice that on rows where the value is B, the number of As is missing. So my next step is to fill in each missing value with the value from the previous record. This is why I sorted the data at the beginning.

If an id has no A at all, I replace the missing value with zero.

The way I count # of Bs is similar. The only difference is sorting.

You may be curious: what if I have more than two groups? Well, my code only works for two groups, and I have not found a way to count three or more groups by individual.

If you have tips or code to achieve the task, please let me know!

Update
2012.4.27:

One friend shared with me her code:


foreach i in A B C D E F G H I J K L N P Q R S T U V W X Y Z {
    bys id: egen nof`i' = total(level == "`i'")
}
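The same idea can be sketched without typing the group letters by hand, by letting levelsof collect the values that actually occur in the data. This is only a sketch on the toy data from the question; it assumes every value of level can legally appear in a variable name (letters, digits, or underscores), because the value is pasted into the new variable's name.

```stata
* Toy data matching the example in the question
clear
input id str1 level
1 "A"
1 "A"
1 "B"
2 "A"
2 "B"
3 "B"
end

* Collect the distinct values of level into a local macro
levelsof level, local(levels)

* For each value, count the matching rows within each id
foreach lev of local levels {
    bysort id: egen nof`lev' = total(level == "`lev'")
}
list, sepby(id)
```

Because total() sums a true/false expression, ids with no rows in a group get 0 rather than missing, so no fill-in step is needed.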

Stata: Create id by group

Sunday, April 22, 2012

When doing data analysis, you will sometimes encounter the following situation: everyone in your dataset has a unique ID, but the IDs are long and each participant has multiple records (that is, the dataset is in long format).

To visualize your data, you need to create a new ID for each individual regardless of how many records each person has. For example, if the first person has three records, all three get the new ID 1, and the second person's records get 2.

Though it sounds tedious, it is not difficult.

egen id = group(oldid)

Just one line and your problem will be solved.
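Here is a minimal sketch of what that line does on a small long-format dataset. The ID strings and the visit variable are invented for illustration.

```stata
* Toy long-format data: two people with long string IDs
clear
input str10 oldid visit
"AX-90211-K" 1
"AX-90211-K" 2
"AX-90211-K" 3
"BQ-17734-M" 1
"BQ-17734-M" 2
end

* Assign 1, 2, ... to each distinct value of oldid (in sorted order)
egen id = group(oldid)
list, sepby(id)
```

All three records of the first person get id 1, and both records of the second person get id 2.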

Reference: http://www.stata.com/support/faqs/data/group.html

Stata tutorial index

Saturday, May 14, 2011

This is an index page for the Stata tutorials I have written on this blog. I will update this index as I write more. It also reminds me of what I still need to write.

Data management

Stata: How to deal with missing values?

Export

Stata: How to export descriptive statistics tables?

Stata: Output correlation table

Stata: Export OLS regression table to Word or Excel

Stata: Export Logistic Regression (Coefficient/Odds ratio) to Word or Excel

Graph

Stata: Draw regression lines across groups

Endnote: Using Filter to import from CSA ILLUMINA (EndNote X4 compatible)

If you are in psychology or education, you probably use CSA Illumina or PsycINFO a great deal. You can save time by importing references directly from CSA Illumina or PsycINFO. It is important to note that the correct filter is required for everything to work properly.

How do we do that?

I use an article from CSA Illumina as an example: Melby, J. N., Conger, R. D., Fang, S.-A., Wickrama, K. A. S., & Conger, K. J. (2008). Adolescent family experiences and educational attainment during early adulthood. Developmental Psychology, 44(6), 1519-1536.

After locating the article, you will see a “Save, Print, Email” option at the top. Click it.

Choose current view record and use full format without references. I personally do not like to import references from the article because these references may not meet your requirements.

Then click “save”.

Open this txt file using endnote.exe, and EndNote will ask you to choose the right filter. Select e-psyche (CSA).

The reference should then appear in your EndNote library.

Packing list for conferences

Thursday, April 29, 2010

Packing

Passport or photo ID (e.g., driver's license)
Boarding pass
Camera (with charger and cable)
Laptop (and charger)
Clothes, socks, and PJs
Umbrella
Mobile phone (and charger)
Toothbrush and toothpaste
Medicine (in case you have a cold or stomach ache)
Lip balm
Dollar bills (in case you want to leave tips)
Business cards
Hotel and transportation information
Nail clippers (in case you travel for a long time)
Comb
Flash drive (in case you want to share your files or use other people’s computers)
Slippers
Perfume

Before you leave

Back up the files on your laptop
Delete sensitive data (in case you use restricted datasets or have private files)
Set your alarm
Copy your passport and other official documents

Updates:
2010.5.11: added perfume, shave, and necktie.
2010.5.12: added backing up passport and documents