Resources

Things I share with students and collaborators: how to find a predoc, AI tools for research, useful datasets, and research tools and guides.

Predoc Tips

Sources for finding predoctoral and RA positions

This list is not exhaustive, but it’s a good place to start.

A hub of predoc resources. Predoc.org collects useful materials, including practice coding tasks and a newsletter with advertised positions.

Sites that aggregate many postings

  • NBER posts non-NBER RA positions here, and advertises some directly here.
  • Professors and others tag relevant RA postings on the @econ_RA Twitter/X page (no account needed to read).

Organized predoc programs with a set hiring cycle

Programs for US citizens and/or permanent residents only

Other sources to check

Guides

AI Tools

I ran a hands-on workshop for economists at UCSD on Claude Cowork + Code. The session covered how these tools fit into an economics research workflow: reading and editing LaTeX drafts, writing and running Stata, R, and Python code, and working with project files, with live examples. Slides and the workshop thread:

Claude Cowork for economists: workshop

Workshop slides (PDF) · workshop thread on X

Data

Curated datasets I point students to.

Geospatial data for development & urban research

Publicly available spatial and satellite data for development and urban research:

  • GHS - Global Human Settlement Layer (JRC / Copernicus). Global gridded data on human presence, 1975 onward.
    • GHS-BUILT (built-up surface), GHS-POP (population grids), GHS-SMOD (settlement model / degree of urbanization), and GHS-UCDB (Urban Centre Database of consistently delineated cities).
  • DHS - Demographic and Health Surveys. Household survey microdata across many developing countries; the geospatial program adds (randomly displaced) GPS cluster coordinates and covariates such as night lights. Confidential GPS data needs a short access request.
  • Night lights. VIIRS and DMSP nighttime-lights rasters, a standard proxy for local economic activity (see Matt Lowe’s night-lights and ArcGIS guide).
  • GADM. Administrative boundaries worldwide, from the country level down to administrative level 2 or 3.
  • IPUMS International. Harmonized (consistent-over-time GEOLEV1 / GEOLEV2) and unharmonized (year-specific) administrative boundaries linked to census microdata; see the GIS boundary files. IPUMS USA does the same for the US.
  • Aggregated catalogs. The UPenn Libraries GIS guide (global and US spatial data) and the geo4.dev data catalog (development-focused) list many more sources.
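
Because DHS cluster coordinates are randomly displaced, gridded covariates such as night lights are often averaged over a buffer around each cluster rather than read at the exact point. A minimal Python sketch of that idea follows. The grid cells, given as (lat, lon, value) tuples, are made up, the function names are hypothetical, and the 2 km radius is an assumption to check against the DHS documentation for the survey in question.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def buffer_mean(cluster, cells, radius_km):
    """Mean value of the grid cells within radius_km of the cluster.

    cluster: (lat, lon) of the (displaced) survey cluster.
    cells:   iterable of (lat, lon, value) grid-cell centroids.
    Returns None when no cell falls inside the buffer.
    """
    lat, lon = cluster
    values = [v for (clat, clon, v) in cells
              if haversine_km(lat, lon, clat, clon) <= radius_km]
    return sum(values) / len(values) if values else None


# Made-up example: three grid cells roughly 0, 1.1, and 11 km north of the cluster.
cells = [(0.00, 0.00, 10.0), (0.01, 0.00, 20.0), (0.10, 0.00, 90.0)]
cluster = (0.0, 0.0)

# Assumed buffer radius; the far cell is excluded from the average.
urban_mean = buffer_mean(cluster, cells, radius_km=2.0)
print(urban_mean)
```

With real data, the cell list would come from a raster (for example a VIIRS tile), and a GIS library would do the buffering far more efficiently than this loop.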

Mental health & sleep data

Publicly available datasets with a validated mental-health or wellbeing measure, a few key economics papers with open replication data, speech/audio depression corpora, and sleep data (self-reported and objective actigraphy / lab). Most datasets are free with registration; restricted or paid ones are flagged. The MH / sleep measures column lists each dataset’s mental-health instruments and what sleep data it collects, if any (checked in the questionnaires / codebooks). The Youth available? column notes whether adolescents / young adults are covered: by age (a general sample you can filter), by student status, or as a youth-only sample.

Development and low- or middle-income country panels

Dataset | MH / sleep measures | Geographic coverage | Level | Total N | Youth available? | Access
IFLS - Indonesia Family Life Survey | CES-D-10; Sleep: PROMIS quality & disturbance items (wave 5) | Indonesia, 13 provinces; province-level (GPS restricted) | Individual & household | ~30,000 individuals / 7,200+ households | Yes, age 15+ | Free, registration
MxFLS - Mexican Family Life Survey | Zung/Calderon depression; Sleep: daily hours (time-use, all waves) | Mexico, national; state & municipality | Individual & household | ~35,000 individuals / 8,400 households | Yes, age 15+ | Free, registration
Young Lives | SRQ-20, GAD-7/PHQ-8, Cantril; Sleep: time-use hours/day (rounds 2-5, 7) | Ethiopia, India, Peru, Vietnam; region/district | Individual (child cohort) | ~12,000 children | Youth cohort (child to young adult) | Free, registration (UK Data Service)
NIDS - National Income Dynamics Study | CES-D-10; Sleep: only the CES-D restless-sleep item | South Africa, national; district municipality | Individual | ~28,000 individuals / 7,300 households | Yes, age 15+ | Free, registration
CFPS - China Family Panel Studies | CES-D, Kessler K6; Sleep: hours (2014+), bedtime & naps (all waves) | China, 25 provinces (~95% of pop.) | Individual & household | ~42,600 individuals / 14,960 households | Yes, age 10+ | Free, registration + data-use agreement
KLPS - Kenya Life Panel Survey | CES-D-10 (KLPS-4); Sleep: bed/wake times, quality, naps (KLPS-4) | Kenya, Busia County cohort (followed nationwide and abroad) | Individual (+ 2nd-gen children) | ~7,500 cohort + ~5,200 children | Yes (young-adult cohort) | Free, open (CC0)

United States, UK & Europe

Dataset | MH / sleep measures | Geographic coverage | Level | Total N | Youth available? | Access
Add Health | CES-D (modified); Sleep: duration, timing & quality items (all waves) | US, national; geocodes restricted | Individual | ~20,000 (in-home) | Yes (adolescent cohort) | Public-use free; full sample restricted
NSDUH - Nat. Survey on Drug Use & Health | Kessler K6, MDE module; Sleep: MDE insomnia / hypersomnia items only | US, national + state (small-area est.) | Individual | ~67,500 / year | Yes, age 12+ | Free public-use files
NHANES | PHQ-9; Sleep: SLQ items (2005+); wrist accelerometry 2011-14 | US, national only in public file | Individual | ~5,000 / year | Yes (12-17 restricted) | Free (18+); 12-17 file restricted
NCS-R / NCS-A | CIDI diagnostic; Sleep: insomnia items; NCS-A adds bedtime & hours | US, national | Individual | 9,282 / 10,123 | Yes (NCS-A, 13-18) | NCS-R free; NCS-A restricted
HRS - Health & Retirement Study | CES-D-8; Sleep: Jenkins insomnia items (2002+); time-use hours | US, national | Individual | ~20,000 / wave | No (50+) | Free, registration
Healthy Minds Study | PHQ-9, GAD-7, flourishing; Sleep: duration items; ISI module (some waves) | US colleges; Census region only, individual colleges blinded | Individual (student) | ~935,000 (675+ colleges) | Students only (college) | Free, de-identified; short data-request form
Understanding Society (UKHLS) | GHQ-12, SWEMWBS; youth SDQ; Sleep: PSQI-derived items (waves 1, 4, 7, 10, 13) | UK; region public, finer restricted | Individual & household | ~40,000 households / ~100,000 individuals | Yes (youth panel 10-15) | Free, registration (UK Data Service)
ELSA - English Longitudinal Study of Ageing | CES-D-8; Sleep: items in waves 4/6/8; wrist actigraphy (wave 10) | England; region-level public | Individual | ~11,400 (Wave 1 core) | No (50+) | Free, registration
SHARE | EURO-D; Sleep: trouble-sleeping & medication items; hours (waves 8-9) | 28 European countries + Israel; country-level | Individual | ~160,000 respondents | No (50+) | Free (scientific use), registration
UK Biobank | PHQ-9, GAD-7, CIDI-SF; Sleep: duration, chronotype & insomnia items; actigraphy (~103,000) | UK; location restricted (1 km grid) | Individual | ~500,000 | No (40-69) | Application + fee + agreement
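
Several of the datasets above report PHQ-9 items. As a rough illustration of how item responses become a score, here is a minimal Python sketch of the standard published scoring: nine items coded 0-3, summed to a 0-27 total, with the conventional severity bands. The function names are hypothetical. Individual datasets may code items differently (for example, with separate codes for refusals or missing responses), so check each codebook before applying anything like this.

```python
def phq9_score(items):
    """Sum nine PHQ-9 item responses, each coded 0-3; returns a total in 0-27."""
    if len(items) != 9:
        raise ValueError("PHQ-9 has exactly nine items")
    if any(x not in (0, 1, 2, 3) for x in items):
        raise ValueError("each item must be coded 0, 1, 2, or 3")
    return sum(items)


def phq9_severity(total):
    """Map a PHQ-9 total to the conventional severity band."""
    if not 0 <= total <= 27:
        raise ValueError("PHQ-9 totals range from 0 to 27")
    # (upper bound of band, label)
    bands = [(4, "minimal"), (9, "mild"), (14, "moderate"),
             (19, "moderately severe"), (27, "severe")]
    for upper, label in bands:
        if total <= upper:
            return label


# Made-up respondent: answers to the nine items.
responses = [1, 2, 1, 0, 2, 1, 1, 2, 0]
total = phq9_score(responses)
print(total, phq9_severity(total))
```

A total of 10 or more is a commonly used screening threshold, but papers differ, so the cutoff is better treated as an analysis choice than hard-coded.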

Cross-national and global

Dataset | MH / sleep measures | Geographic coverage | Level | Total N | Youth available? | Access
WHO World Mental Health | CIDI diagnostic; Sleep: insomnia items (chronic-conditions section) | 28+ countries; country-level | Individual | >200,000 interviews | No (adults) | Restricted (consortium agreement)
HBSC - Health Behaviour in School-aged Children | Psychosomatic scale, Cantril; Sleep: sleep-onset difficulties (all rounds); bedtimes (optional) | 45+ countries; country/region | Individual (student) | ~220,000+ / round | Students only (ages 11/13/15) | Aggregate public; microdata by request (embargo)
Global Burden of Disease | Modeled prevalence & burden (not survey items); Sleep: none | 204 countries + some subnational | Country-year (aggregate) | Aggregate (not respondents) | Yes (age bands, incl. 10-19) | Free, registration
DHS - Demographic and Health Surveys | PHQ-9 + GAD-7 module; Sleep: none | Overall 63 countries (displaced GPS clusters); MH module in a small but growing set (Nepal, Kenya, Bangladesh, and others), not most surveys | Individual & household | ~5,000-30,000 households / survey | Yes, age 15-49 | Free, registration

Subjective wellbeing (life satisfaction and happiness, not clinical mental health)

Dataset | MH / sleep measures | Geographic coverage | Level | Total N | Youth available? | Access
Gallup World Poll | Cantril ladder, daily affect; Sleep: “well-rested yesterday” item only | 160+ countries; country-level | Individual | ~1,000 / country / year | Yes, age 15+ | Paid microdata; some free aggregates
World Values Survey | Life satisfaction, happiness; Sleep: none | 64 countries (Wave 7); country-level | Individual | ~95,000 / wave | No (18+) | Free, registration

Key mental-health economics papers (with public replication data)

Paper | Year | Journal | Population | Intervention | Replication | Mental-health variables
Haushofer & Shapiro | 2016 | QJE | Poor households, Kenya | RCT: unconditional cash transfers | Harvard Dataverse | Psychological-wellbeing index: CES-D, Cohen stress, WVS happiness & life satisfaction, salivary cortisol
Baranov, Bhalotra, Biroli & Maselko | 2020 | AER | Perinatal mothers, rural Pakistan | RCT: perinatal CBT (Thinking Healthy) | openICPSR | SCID (major-depression diagnosis), Hamilton scale, disability, GAF, social support
Bessone, Rao, Schilbach, Schofield & Toma | 2021 | QJE | Low-income adults, Chennai (India) | RCT: night-sleep devices / incentives; workplace naps | Harvard Dataverse | Psychological-wellbeing index: depression, stress, happiness, life satisfaction, Cantril ladder
Banerjee, Duflo, McKelway, Schilbach et al. | 2023 | Ann. Intern. Med. | Elderly living alone, Tamil Nadu (India) | RCT: phone-based CBT; one-time cash transfer | Harvard Dataverse | Geriatric Depression Scale, WHODAS, single-item loneliness
Angelucci & Bennett | 2024 | AER | Adults with depression, Karnataka (India) | RCT: antidepressant pharmacotherapy; livelihood support | openICPSR | PHQ-9 (screening + severity)

Detecting depression from speech / audio - public corpora, mostly from the speech-ML and clinical communities (I found no economics study that has released audio-based depression data):

  • DAIC-WOZ / E-DAIC (USC). Field-standard English corpus: clinical interviews, audio + transcripts, PHQ-8 labels (the AVEC benchmark). Free but restricted (signed application, institutional email).
  • Androids Corpus. Italian speech (reading + interview), clinician diagnoses; open direct download (academic terms).
  • EATD-Corpus. Chinese speech + text with SDS depression labels; open download.
  • MODMA (Lanzhou). Audio (+ EEG), clinical MDD diagnosis; free, but requires an account and an agreement.

Sleep in economics field experiments - objective wearable/actigraphy sleep alongside self-report and economic outcomes:

Study | Year | Population | Sleep measure | Data
Bessone, Rao, Schilbach, Schofield & Toma, QJE | 2021 | Low-income adults, Chennai (India) | Actigraphy + self-report | Harvard Dataverse (public)
Giuntella, Saccardo & Sadoff, JPE (forthcoming) | 2025 | ~1,150 US university students | Fitbit + self-report | NBER w32550; replication not yet public
Avery, Giuntella & Jiao, REStat | 2025 | US college students | Wearable + self-report | paper; no public package located

Objective sleep-data repositories (polysomnography and actigraphy):

Source | Sleep measure | Coverage | Access
NSRR - National Sleep Research Resource (NHLBI) | PSG + actigraphy + questionnaires | 13+ cohorts, 26,000+ people (SHHS, MESA, MrOS, CHAT, …) | Free, per-dataset data-use agreement
UK Biobank accelerometer sub-study | Wrist actigraphy (7-day); derived sleep duration/efficiency/timing | ~104,000 participants | Approved application + fee
NHANES accelerometry (2011-2014) | Wrist accelerometry (minute-level) | US, nationally representative | Fully public, no application
PhysioNet (Sleep-EDF, MMASH, Apple-Watch+PSG) | PSG and/or consumer wearable with PSG labels | Small validation cohorts | Mostly open access

Research Tools

[Image: a US choropleth map made with Stata's spmap]

Stata tools and guides

Useful Stata tools with self-help guides, including making maps (SPMAP / GRMAP) and sample do-files.

[Image: a historical map being digitized]

Digitizing historical maps with QGIS and Python

A step-by-step guide to georeferencing and digitizing historical maps (co-authored with a lab student).
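
To give a flavor of what georeferencing involves, here is a minimal Python sketch that fits an affine transform from three ground control points (pixel position to map coordinates) and applies it to a new pixel. It is an illustration, not necessarily the method used in the guide: the control points and function names are made up, and GIS software typically uses more control points with a least-squares fit or a higher-order transform.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def solve3(m, rhs):
    """Solve the 3x3 linear system m @ x = rhs with Cramer's rule."""
    d = det3(m)
    if abs(d) < 1e-12:
        raise ValueError("control points are collinear; pick a different set")
    solution = []
    for k in range(3):
        mk = [list(row) for row in m]
        for r in range(3):
            mk[r][k] = rhs[r]  # replace column k with the right-hand side
        solution.append(det3(mk) / d)
    return solution


def fit_affine(gcps):
    """Fit x = a*col + b*row + c and y = d*col + e*row + f from three control points.

    gcps: list of three ((col, row), (x, y)) pairs.
    """
    if len(gcps) != 3:
        raise ValueError("this sketch needs exactly three control points")
    m = [[float(col), float(row), 1.0] for (col, row), _ in gcps]
    a, b, c = solve3(m, [xy[0] for _, xy in gcps])
    d, e, f = solve3(m, [xy[1] for _, xy in gcps])
    return a, b, c, d, e, f


def pixel_to_map(coef, col, row):
    """Apply the fitted affine transform to one pixel position."""
    a, b, c, d, e, f = coef
    return a * col + b * row + c, d * col + e * row + f


# Made-up control points: pixel (col, row) on the scan -> (longitude, latitude).
gcps = [((0, 0), (77.0, 29.0)),
        ((1000, 0), (78.0, 29.0)),
        ((0, 1000), (77.0, 28.0))]
coef = fit_affine(gcps)
print(pixel_to_map(coef, 500, 500))
```

Row numbers increase downward on a scanned image while latitude increases upward, which is why the fitted row coefficient for latitude comes out negative in this example.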

[Image: map of India showing distance to the nearest urban area]

SHRUG: open geospatial data for India

The Socioeconomic High-resolution Rural-Urban Geographic Platform: open data covering roughly 600,000 villages and 8,000 towns in India, from the Development Data Lab.