About

PyAnomaly is a comprehensive Python library for asset pricing research, with a focus on generating firm characteristics and factors. It covers the majority of the firm characteristics published in the literature and contains analytic tools commonly used in asset pricing research, such as quantile portfolio construction, factor regression, and cross-sectional regression. The purpose of PyAnomaly is NOT to generate firm characteristics in a fixed manner. Rather, we aim to build a package that can serve as a standard library for asset pricing research and help reduce non-standard errors[5].

The current list of firm characteristics supported by PyAnomaly can be found in Coverage. PyAnomaly is a live project and we plan to add more firm characteristics and functionalities going forward. We also welcome contributions from other scholars.

PyAnomaly is very efficient, comprehensive, and flexible.

Efficiency

PyAnomaly can generate over 200 characteristics for a sample starting in 1950 in around one hour, including the time to download data from WRDS. To achieve this, PyAnomaly uses the numba, multiprocessing, and asyncio packages where possible, but sparingly, so that the code remains readable.
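To illustrate why asyncio helps with downloads, here is a minimal sketch in which several simulated downloads run concurrently. The functions `download_table` and `download_all` are hypothetical stand-ins that sleep instead of querying WRDS; they are not part of PyAnomaly.

```python
import asyncio
import time


async def download_table(name: str, seconds: float) -> str:
    """Hypothetical stand-in for a download; sleeps instead of querying WRDS."""
    await asyncio.sleep(seconds)  # yields control while "waiting on the network"
    return name


async def download_all(tables: dict[str, float]) -> list[str]:
    # Start every download at once and wait until all of them have finished.
    tasks = [download_table(name, secs) for name, secs in tables.items()]
    return await asyncio.gather(*tasks)


if __name__ == "__main__":
    tables = {"funda": 0.2, "fundq": 0.2, "msf": 0.2}
    start = time.perf_counter()
    done = asyncio.run(download_all(tables))
    elapsed = time.perf_counter() - start
    # Elapsed time is close to the slowest download rather than the sum.
    print(done, f"{elapsed:.2f}s")
```

Because the waits overlap, total time is governed by the slowest table rather than by the sum of all downloads.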

Comprehensiveness

PyAnomaly supports over 200 firm characteristics published in the literature. It covers most characteristics in Green et al. (2017)[2] and Jensen et al. (2021)[4], except those that use IBES data. It also provides various tools for asset pricing research.

Flexibility

PyAnomaly follows an object-oriented design, which makes it easy to customize and extend. Users can change the definition of an existing characteristic, add a new characteristic, or change the configuration used to run the program. For instance, when generating firm characteristics, a user can choose whether to update annual accounting variables quarterly (using Compustat.fundq) or annually (using Compustat.funda), and whether to use the latest market equity or the year-end market equity.
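As a rough illustration of this kind of customization, the sketch below overrides one characteristic in a subclass. `ToyFunda`, `MyFunda`, and the method name `gp_at` are hypothetical and only mimic the idea; they are not PyAnomaly's actual classes or signatures. The data are for a single firm, so no grouping by firm is needed.

```python
import pandas as pd


class ToyFunda:
    """Hypothetical stand-in for a characteristics class (not PyAnomaly's API)."""

    def __init__(self, data: pd.DataFrame):
        self.data = data

    def gp_at(self) -> pd.Series:
        # Gross profits scaled by current total assets.
        return (self.data["sale"] - self.data["cogs"]) / self.data["at"]


class MyFunda(ToyFunda):
    """User subclass that changes the definition of one characteristic."""

    def gp_at(self) -> pd.Series:
        # Custom definition: scale by lagged total assets instead.
        return (self.data["sale"] - self.data["cogs"]) / self.data["at"].shift(1)


# Two annual observations for one firm (made-up numbers).
data = pd.DataFrame({"sale": [100.0, 120.0], "cogs": [60.0, 70.0], "at": [200.0, 250.0]})
base = ToyFunda(data).gp_at()
custom = MyFunda(data).gp_at()
print(base.tolist(), custom.tolist())
```

Only the overridden method changes; everything else is inherited from the parent class.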

Main Features

  • Efficient data download from WRDS using asyncio.

  • Generation of over 200 firm characteristics. You can choose which firm characteristics to generate.

  • Fama-French 3-factor and Hou-Xue-Zhang 4-factor portfolios.

  • Analytics

    • Cross-sectional regression

    • 1-D sort

    • 2-D sort

    • Rolling regression

    • Quantile portfolio

    • Long-short portfolio

    • Portfolio performance analysis

  • Data tools

    • Data filtering

    • Winsorizing

    • Trimming

    • Data population
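The difference between winsorizing and trimming in the data tools above can be sketched with plain pandas. The helpers `winsorize` and `trim` below are illustrative assumptions, not PyAnomaly's functions, and the cutoffs are arbitrary.

```python
import pandas as pd


def winsorize(x: pd.Series, lower: float = 0.01, upper: float = 0.99) -> pd.Series:
    """Clip values outside the [lower, upper] quantiles to the quantile values."""
    lo, hi = x.quantile(lower), x.quantile(upper)
    return x.clip(lower=lo, upper=hi)


def trim(x: pd.Series, lower: float = 0.01, upper: float = 0.99) -> pd.Series:
    """Replace values outside the [lower, upper] quantiles with NaN."""
    lo, hi = x.quantile(lower), x.quantile(upper)
    return x.where((x >= lo) & (x <= hi))


x = pd.Series([-100.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 100.0])
w = winsorize(x, 0.1, 0.9)  # extremes are pulled in to the 10%/90% quantiles
t = trim(x, 0.1, 0.9)       # extremes are dropped (set to NaN)
print(w.min(), w.max(), int(t.isna().sum()))
```

In a panel setting, either helper would typically be applied within each date, for example through `groupby("date")[col].transform(winsorize)`.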

Coverage

Markets

PyAnomaly currently supports analysis of the firms listed in the US stock market.

Firm Characteristics

The table below lists firm characteristics that are currently supported by PyAnomaly. The characteristics without a function are not yet available but may be added in the future. For a mapping between the functions and the firm characteristics in Chen and Zimmermann (2020)[1], Green et al. (2017)[2], Hou et al. (2020)[3], and Jensen et al. (2021)[4], refer to the mapping file.

| No. | Description | Author(s) | Year | Journal | Function |
| --- | --- | --- | --- | --- | --- |
| 1 | Effective Tax Rate | Abarbanell and Bushee | 1998 | AR | |
| 2 | Gross margin growth to sales growth | Abarbanell and Bushee | 1998 | AR | dgp_dsale |
| 3 | Industry-adjusted change in capital investment | Abarbanell and Bushee | 1998 | AR | pchcapx_ia |
| 4 | Labor force efficiency | Abarbanell and Bushee | 1998 | AR | sale_emp_gr1 |
| 5 | Sales growth to inventory growth | Abarbanell and Bushee | 1998 | AR | dsale_dinv |
| 6 | Sales growth to receivable growth | Abarbanell and Bushee | 1998 | AR | dsale_drec |
| 7 | Sales growth to SG&A growth | Abarbanell and Bushee | 1998 | AR | dsale_dsga |
| 8 | Liquidity beta (illiquidity-illiquidity) | Acharya and Pedersen | 2005 | JFE | |
| 9 | Liquidity beta (illiquidity-return) | Acharya and Pedersen | 2005 | JFE | |
| 10 | Liquidity beta (return-illiquidity) | Acharya and Pedersen | 2005 | JFE | |
| 11 | Liquidity beta (return-return) | Acharya and Pedersen | 2005 | JFE | |
| 12 | Net liquidity beta | Acharya and Pedersen | 2005 | JFE | |
| 13 | Leverage beta | Adrian, Etula and Muir | 2014 | JF | |
| 14 | Idiosyncratic volatility (GHZ) | Ali, Hwang, and Trombley | 2003 | JFE | idiovol |
| 15 | Idiosyncratic volatility (Org, JKP) | Ali, Hwang, and Trombley | 2003 | JFE | ivol_capm_252d |
| 16 | Illiquidity | Amihud | 2002 | JFM | ami_126d |
| 17 | Bid-ask spread | Amihud and Mendelson | 1986 | JFE | baspread |
| 18 | Three-year investment growth | Anderson and Garcia-Feijoo | 2006 | JF | capx_gr3 |
| 19 | Two-year investment growth | Anderson and Garcia-Feijoo | 2006 | JF | capx_gr2 |
| 20 | Dispersion in analyst long-term growth forecasts | Anderson, Ghysels, and Juergens | 2005 | RFS | |
| 21 | Idiosyncratic volatility (CAPM) | Ang et al. | 2006 | JF | ivol_capm_21d |
| 22 | Idiosyncratic volatility (FF3) | Ang et al. | 2006 | JF | ivol_ff3_21d |
| 23 | Idiosyncratic volatility (q-factor) | Ang et al. | 2006 | JF | ivol_hxz4_21d |
| 24 | Return volatility | Ang et al. | 2006 | JF | retvol |
| 25 | Systematic volatility | Ang et al. | 2006 | JF | |
| 26 | Downside beta | Ang, Chen, and Xing | 2006 | RFS | betadown_252d |
| 27 | Industry-adjusted book-to-market | Asness, Porter, and Stevens | 2000 | WP | bm_ia |
| 28 | Industry-adjusted cash flow-to-price | Asness, Porter, and Stevens | 2000 | WP | cfp_ia |
| 29 | Industry-adjusted change in employees | Asness, Porter, and Stevens | 2000 | WP | chempia |
| 30 | Industry-adjusted firm size | Asness, Porter, and Stevens | 2000 | WP | mve_ia |
| 31 | Book-to-market (June ME) | Asness and Frazzini | 2013 | JPM | |
| 32 | Highest 5 days of return to volatility | Asness et al. | 2020 | JFE | rmax5_rvol_21d |
| 33 | Market correlation | Asness et al. | 2020 | JFE | corr_1260d |
| 34 | Quality minus Junk: Composite | Asness, Frazzini, and Pedersen | 2018 | RAS | qmj |
| 35 | Quality minus Junk: Growth | Asness, Frazzini, and Pedersen | 2018 | RAS | qmj_growth |
| 36 | Quality minus Junk: Profitability | Asness, Frazzini, and Pedersen | 2018 | RAS | qmj_prof |
| 37 | Quality minus Junk: Safety | Asness, Frazzini, and Pedersen | 2018 | RAS | qmj_safety |
| 38 | Change in quarterly return on assets | Balakrishnan, Bartov, and Faurel | 2010 | JAE | niq_at_chg1 |
| 39 | Change in quarterly return on equity | Balakrishnan, Bartov, and Faurel | 2010 | JAE | niq_be_chg1 |
| 40 | Quarterly return on assets | Balakrishnan, Bartov, and Faurel | 2010 | JAE | niq_at |
| 41 | Highest 5 days of return | Bali, Brown, and Tang | 2017 | JFE | rmax5_21d |
| 42 | Maximum daily return | Bali, Cakici, and Whitelaw | 2011 | JFE | rmax1_21d |
| 43 | Idiosyncratic skewness (CAPM) | Bali, Engle, and Murray | 2016 | BOOK | iskew_capm_21d |
| 44 | Idiosyncratic skewness (FF3) | Bali, Engle, and Murray | 2016 | BOOK | iskew_ff3_21d |
| 45 | Idiosyncratic skewness (q-factor) | Bali, Engle, and Murray | 2016 | BOOK | iskew_hxz4_21d |
| 46 | Return skewness | Bali, Engle, and Murray | 2016 | BOOK | rskew_21d |
| 47 | Cash-based operating profitability | Ball et al. | 2016 | JFE | cop_at |
| 48 | Cash-based operating profits to lagged assets | Ball et al. | 2016 | JFE | cop_atl1 |
| 49 | Cash-based operating profits to lagged assets (quarterly) | Ball et al. | 2016 | JFE | |
| 50 | Operating profits-to-assets | Ball et al. | 2016 | JFE | op_at |
| 51 | Operating profits-to-lagged assets | Ball et al. | 2016 | JFE | op_atl1 |
| 52 | Operating profits-to-lagged assets (quarterly) | Ball et al. | 2016 | JFE | |
| 53 | Absolute accruals | Bandyopadhyay, Huang, and Wirjanto | 2010 | WP | absacc |
| 54 | Accrual volatility | Bandyopadhyay, Huang, and Wirjanto | 2010 | WP | stdacc |
| 55 | Market equity | Banz | 1981 | JFE | market_equity |
| 56 | Sales to price | Barbee, Mukherji, and Raines | 1996 | FAJ | sale_me |
| 57 | Sales to price (quarterly) | Barbee, Mukherji, and Raines | 1996 | FAJ | |
| 58 | Number of consecutive quarters with earnings increases | Barth, Elliott, and Finn | 1999 | JAR | ni_inc8q |
| 59 | Earnings to price | Basu | 1983 | JFE | ni_me |
| 60 | Earnings to price (quarterly) | Basu | 1983 | JFE | |
| 61 | Forecasted growth in 5-year EPS | Bauman and Dowen | 1988 | FAJ | |
| 62 | Inventory growth | Belo and Lin | 2012 | RFS | inv_gr1 |
| 63 | Brand capital to assets | Belo, Lin and Vitorino | 2014 | RED | |
| 64 | Employment growth | Belo, Lin, and Bazdresch | 2014 | JPE | emp_gr1 |
| 65 | Debt to market | Bhandari | 1988 | JFE | debt_me |
| 66 | Debt to market (quarterly) | Bhandari | 1988 | JFE | |
| 67 | 12 month residual momentum | Blitz, Huij, and Martens | 2011 | JEF | resff3_12_1 |
| 68 | 6 month residual momentum | Blitz, Huij, and Martens | 2011 | JEF | resff3_6_1 |
| 69 | Change in operating cash flow to assets | Bouchard et al. | 2019 | JF | ocf_at_chg1 |
| 70 | Operating cash flow to assets | Bouchard et al. | 2019 | JF | ocf_at |
| 71 | Net payout yield | Boudoukh et al. | 2007 | JF | eqnpo_me |
| 72 | Net payout yield (quarterly) | Boudoukh et al. | 2007 | JF | |
| 73 | Payout yield | Boudoukh et al. | 2007 | JF | eqpo_me |
| 74 | Payout yield (quarterly) | Boudoukh et al. | 2007 | JF | |
| 75 | Net debt finance | Bradshaw, Richardson, and Sloan | 2006 | JAE | dbnetis_at |
| 76 | Net equity finance | Bradshaw, Richardson, and Sloan | 2006 | JAE | eqnetis_at |
| 77 | Net external finance | Bradshaw, Richardson, and Sloan | 2006 | JAE | netis_at |
| 78 | Dollar trading volume (JKP) | Brennan, Chordia, and Subrahmanyam | 1998 | JFE | dolvol_126d |
| 79 | Dollar trading volume (Org, GHZ) | Brennan, Chordia, and Subrahmanyam | 1998 | JFE | dolvol |
| 80 | Return on invested capital | Brown and Rowe | 2007 | WP | roic |
| 81 | Failure probability, monthly | Campbell, Hilscher, and Szilagyi | 2008 | JF | |
| 82 | Failure probability | Campbell, Hilscher, and Szilagyi | 2008 | JF | |
| 83 | Earnings announcement return (Chan et al.) | Chan, Jegadeesh, and Lakonishok | 1996 | JF | |
| 84 | Advertising expense to market | Chan, Lakonishok, and Sougiannis | 2001 | JF | |
| 85 | R&D to market | Chan, Lakonishok, and Sougiannis | 2001 | JF | rd_me |
| 86 | R&D to market (quarterly) | Chan, Lakonishok, and Sougiannis | 2001 | JF | |
| 87 | R&D to sales | Chan, Lakonishok, and Sougiannis | 2001 | JF | rd_sale |
| 88 | R&D to sales (quarterly) | Chan, Lakonishok, and Sougiannis | 2001 | JF | |
| 89 | Cash productivity | Chandrashekar and Rao | 2009 | WP | cashpr |
| 90 | CAPEX and inventory | Chen and Zhang | 2010 | JF | invest |
| 91 | Volatility of dollar trading volume (GHZ) | Chordia, Subrahmanyam, and Anshuman | 2001 | JFE | std_dolvol |
| 92 | Volatility of dollar trading volume (JKP) | Chordia, Subrahmanyam, and Anshuman | 2001 | JFE | dolvol_var_126d |
| 93 | Volatility of share turnover (GHZ) | Chordia, Subrahmanyam, and Anshuman | 2001 | JFE | std_turn |
| 94 | Volatility of share turnover (JKP) | Chordia, Subrahmanyam, and Anshuman | 2001 | JFE | turnover_var_126d |
| 95 | Customer momentum | Cohen and Frazzini | 2008 | JF | |
| 96 | Segment momentum | Cohen and Lou | 2012 | JFE | |
| 97 | Asset growth | Cooper, Gulen, and Schill | 2008 | JF | at_gr1 |
| 98 | Asset growth (quarterly) | Cooper, Gulen, and Schill | 2008 | JF | |
| 99 | High-low bid-ask spread | Corwin and Schultz | 2012 | JF | bidaskhl_21d |
| 100 | Disparity between long- and short-term earnings growth forecasts | Da and Warachka | 2011 | JFE | |
| 101 | Composite equity issuance (Org) | Daniel and Titman | 2006 | JF | eqnpo_60m |
| 102 | Composite equity issuance (JKP, 12 months) | Daniel and Titman | 2006 | JF | eqnpo_12m |
| 103 | Intangible return | Daniel and Titman | 2006 | JF | |
| 104 | Share turnover (JKP) | Datar, Naik, and Radcliffe | 1998 | JFM | turnover_126d |
| 105 | Share turnover (Org, GHZ) | Datar, Naik, and Radcliffe | 1998 | JFM | turn |
| 106 | Long-term reversal (12-36) | De Bondt and Thaler | 1985 | JF | ret_36_12 |
| 107 | Long-term reversal (12-60) | De Bondt and Thaler | 1985 | JF | ret_60_12 |
| 108 | Equity duration | Dechow, Sloan, and Soliman | 2004 | RAS | eq_dur |
| 109 | Operating Cash flows to price (JKP) | Desai, Rajgopal, and Venkatachalam | 2004 | AR | ocf_me |
| 110 | Operating Cash flows to price (Org, GHZ) | Desai, Rajgopal, and Venkatachalam | 2004 | AR | cfp |
| 111 | Operating Cash flows to price (quarterly) | Desai, Rajgopal, and Venkatachalam | 2004 | AR | |
| 112 | Altman Z-score | Dichev | 1998 | JF | z_score |
| 113 | Altman Z-score (quarterly) | Dichev | 1998 | JF | |
| 114 | Ohlson O-score | Dichev | 1998 | JF | o_score |
| 115 | Ohlson O-Score (quarterly) | Dichev | 1998 | JF | |
| 116 | Credit Rating Downgrade | Dichev and Piotroski | 2001 | JF | |
| 117 | Dispersion in analysts’ earnings forecasts | Diether, Malloy, and Scherbina | 2002 | JF | |
| 118 | Dimson Beta | Dimson | 1979 | JFE | beta_dimson_21d |
| 119 | Probability of informed trading | Easley, Hvidkjaer, and O’Hara | 2002 | JF | |
| 120 | Unexpected R&D increase | Eberhart, Maxwell, and Siddique | 2004 | JF | rd |
| 121 | Industry-adjusted organizational capital | Eisfeldt and Papanikolaou | 2013 | JF | |
| 122 | Organization capital/assets | Eisfeldt and Papanikolaou | 2013 | JF | |
| 123 | Analysts coverage | Elgers, Lo, and Pfeiffer | 2001 | AR | |
| 124 | Analysts’ earnings forecast-to-price | Elgers, Lo, and Pfeiffer | 2001 | AR | |
| 125 | Change in long-term net operating assets | Fairfield, Whisenant, and Yohn | 2003 | AR | lnoa_gr1a |
| 126 | Assets-to-market | Fama and French | 1992 | JF | at_me |
| 127 | Assets-to-market (quarterly) | Fama and French | 1992 | JF | |
| 128 | Book leverage | Fama and French | 1992 | JF | at_be |
| 129 | Book leverage (quarterly) | Fama and French | 1992 | JF | |
| 130 | Operating profits to book equity (JKP) | Fama and French | 2015 | JFE | ope_be |
| 131 | Operating profits to book equity (GHZ, Org) | Fama and French | 2015 | JFE | operprof |
| 132 | Operating profits to lagged book equity | Fama and French | 2015 | JFE | ope_bel1 |
| 133 | Operating profits to lagged equity (quarterly) | Fama and French | 2015 | JFE | |
| 134 | Beta squared (GHZ) | Fama and MacBeth | 1973 | JPE | betasq |
| 135 | Market beta (GHZ) | Fama and MacBeth | 1973 | JPE | beta |
| 136 | Market beta (Org, JKP) | Fama and MacBeth | 1973 | JPE | beta_60m |
| 137 | Earnings surprise | Foster, Olsen, and Shevlin | 1984 | AR | niq_su |
| 138 | Earnings conservatism | Francis et al. | 2004 | AR | |
| 139 | Earnings persistence | Francis et al. | 2004 | AR | ni_ar1 |
| 140 | Earnings predictability | Francis et al. | 2004 | AR | ni_ivol |
| 141 | Earnings smoothness | Francis et al. | 2004 | AR | earnings_variability |
| 142 | Earnings timeliness | Francis et al. | 2004 | AR | |
| 143 | ROA volatility | Francis et al. | 2004 | AR | roavol |
| 144 | Value relevance of earnings | Francis et al. | 2004 | AR | |
| 145 | Accrual quality | Francis et al. | 2005 | JAE | |
| 146 | Accrual quality (quarterly) | Francis et al. | 2005 | JAE | |
| 147 | Analysts optimism | Frankel and Lee | 1998 | JAE | |
| 148 | Analysts-based intrinsic value-to-market | Frankel and Lee | 1998 | JAE | |
| 149 | Intrinsic value-to-market | Frankel and Lee | 1998 | JAE | intrinsic_value |
| 150 | Predicted analysts forecast error | Frankel and Lee | 1998 | JAE | |
| 151 | Pension funding rate (scaled by market equity) | Franzoni and Marin | 2006 | JF | |
| 152 | Pension funding rate (scaled by assets) | Franzoni and Marin | 2006 | JF | |
| 153 | Frazzini-Pedersen beta | Frazzini and Pedersen | 2014 | JFE | betabab_1260d |
| 154 | 52-week high | George and Hwang | 2004 | JF | prc_highprc_252d |
| 155 | Change in 6-month momentum | Gettleman and Marks | 2006 | WP | chmom |
| 156 | Corporate governance index | Gompers, Ishii, and Metrick | 2003 | QJE | |
| 157 | Percent discretionary accruals | Hafzalla, Lundholm, and Van Winkle | 2011 | AR | |
| 158 | Percent operating accruals (JKP) | Hafzalla, Lundholm, and Van Winkle | 2011 | AR | oaccruals_ni |
| 159 | Percent operating accruals (GHZ, Org) | Hafzalla, Lundholm, and Van Winkle | 2011 | AR | pctacc |
| 160 | Percent total accruals | Hafzalla, Lundholm, and Van Winkle | 2011 | AR | taccruals_ni |
| 161 | Tangibility | Hahn and Lee | 2009 | JF | tangibility |
| 162 | Tangibility (quarterly) | Hahn and Lee | 2009 | JF | |
| 163 | Trend factor | Han, Zhou, and Zhu | 2016 | JFE | trend_factor |
| 164 | Coskewness | Harvey and Siddique | 2000 | JF | coskew_21d |
| 165 | Capital turnover | Haugen and Baker | 1996 | JFE | at_turnover |
| 166 | Capital turnover (quarterly) | Haugen and Baker | 1996 | JFE | |
| 167 | Return on equity | Haugen and Baker | 1996 | JFE | ni_be |
| 168 | Analysts’ forecast change | Hawkins, Chamberlin, and Daniel | 1984 | FAJ | |
| 169 | Revisions in analysts’ earnings forecasts | Hawkins, Chamberlin, and Daniel | 1984 | FAJ | |
| 170 | Year 1-lagged return, annual | Heston and Sadka | 2008 | JFE | seas_1_1an |
| 171 | Year 1-lagged return, nonannual | Heston and Sadka | 2008 | JFE | seas_1_1na |
| 172 | Years 2-5 lagged returns, annual | Heston and Sadka | 2008 | JFE | seas_2_5an |
| 173 | Years 2-5 lagged returns, nonannual | Heston and Sadka | 2008 | JFE | seas_2_5na |
| 174 | Years 6-10 lagged returns, annual | Heston and Sadka | 2008 | JFE | seas_6_10an |
| 175 | Years 6-10 lagged returns, nonannual | Heston and Sadka | 2008 | JFE | seas_6_10na |
| 176 | Years 11-15 lagged returns, annual | Heston and Sadka | 2008 | JFE | seas_11_15an |
| 177 | Years 11-15 lagged returns, nonannual | Heston and Sadka | 2008 | JFE | seas_11_15na |
| 178 | Years 16-20 lagged returns, annual | Heston and Sadka | 2008 | JFE | seas_16_20an |
| 179 | Years 16-20 lagged returns, nonannual | Heston and Sadka | 2008 | JFE | seas_16_20na |
| 180 | Citations to R&D expenses | Hirshleifer, Hsu, and Li | 2013 | JFE | |
| 181 | Patents to R&D expenses | Hirshleifer, Hsu, and Li | 2013 | JFE | |
| 182 | Change in net operating assets | Hirshleifer et al. | 2004 | JAE | noa_gr1a |
| 183 | Net operating assets | Hirshleifer et al. | 2004 | JAE | noa_at |
| 184 | Change in depreciation to PP&E | Holthausen and Larcker | 1992 | JAE | pchdepr |
| 185 | Depreciation to PP&E | Holthausen and Larcker | 1992 | JAE | depr |
| 186 | Sin stock | Hong and Kacperczyk | 2009 | JFE | sin |
| 187 | Industry lead-lag effect in earnings surprises | Hou | 2007 | RFS | |
| 188 | Industry lead-lag effect in prior returns | Hou | 2007 | RFS | |
| 189 | Bid-ask spread (TAQ) | Hou and Loh | 2016 | JFE | |
| 190 | Price delay based on R-squared | Hou and Moskowitz | 2005 | RFS | pricedelay |
| 191 | Price delay based on SE-adjusted slopes | Hou and Moskowitz | 2005 | RFS | |
| 192 | Price delay based on slopes | Hou and Moskowitz | 2005 | RFS | pricedelay_slope |
| 193 | Industry concentration (book equity) | Hou and Robinson | 2006 | JF | herf_be |
| 194 | Industry concentration (sales) | Hou and Robinson | 2006 | JF | herf_sale |
| 195 | Industry concentration (total assets) | Hou and Robinson | 2006 | JF | herf_at |
| 196 | Return on equity (quarterly) | Hou, Xue, and Zhang | 2015 | RFS | niq_be |
| 197 | Cash flow volatility | Huang | 2009 | JEF | ocfq_saleq_std |
| 198 | Short-term reversal | Jegadeesh | 1990 | JF | ret_1_0 |
| 199 | Revenue surprise | Jegadeesh and Livnat | 2006 | JFE | saleq_su |
| 200 | Momentum (12 month) | Jegadeesh and Titman | 1993 | JF | ret_12_1 |
| 201 | Momentum (3 month) | Jegadeesh and Titman | 1993 | JF | ret_3_1 |
| 202 | Momentum (6 month) | Jegadeesh and Titman | 1993 | JF | ret_6_1 |
| 203 | Momentum (9 month) | Jegadeesh and Titman | 1993 | JF | ret_9_1 |
| 204 | Firm age | Jiang, Lee, and Zhang | 2005 | RAS | age |
| 205 | Revenue surprise (Karma) | Karma | 2009 | JBFA | rsup |
| 206 | Tail risk | Kelly and Jiang | 2014 | RFS | |
| 207 | Earnings announcement return (Kishore et al.) | Kishore et al. | 2008 | WP | |
| 208 | Long-term EPS forecast | La Porta | 1996 | JF | |
| 209 | Long-term EPS forecast (monthly sort) | La Porta | 1996 | JF | |
| 210 | Annual sales growth | Lakonishok, Shleifer, and Vishny | 1994 | JF | sale_gr1 |
| 211 | Annual sales growth (quarterly) | Lakonishok, Shleifer, and Vishny | 1994 | JF | |
| 212 | Cash flow-to-price | Lakonishok, Shleifer, and Vishny | 1994 | JF | fcf_me |
| 213 | Cash flow-to-price (quarterly) | Lakonishok, Shleifer, and Vishny | 1994 | JF | |
| 214 | Five-year sales growth rank | Lakonishok, Shleifer, and Vishny | 1994 | JF | |
| 215 | Three-year sales growth | Lakonishok, Shleifer, and Vishny | 1994 | JF | sale_gr3 |
| 216 | Kaplan-Zingales index | Lamont, Polk, and Saa-Requejo | 2001 | RFS | kz_index |
| 217 | Kaplan-Zingales index (quarterly) | Lamont, Polk, and Saa-Requejo | 2001 | RFS | |
| 218 | Abnormal volume in earnings announcement month | Lerman, Livnat, and Mendenhall | 2008 | WP | |
| 219 | Taxable income to income (JKP) | Lev and Nissim | 2004 | AR | pi_nix |
| 220 | Taxable income to income (Org, GHZ) | Lev and Nissim | 2004 | AR | tb |
| 221 | Taxable income to income (quarterly) | Lev and Nissim | 2004 | AR | |
| 222 | R&D capital-to-assets | Li | 2011 | RFS | rd5_at |
| 223 | Dividend yield (JKP) | Litzenberger and Ramaswamy | 1979 | JF | div12m_me |
| 224 | Dividend yield (GHZ) | Litzenberger and Ramaswamy | 1979 | JF | dy |
| 225 | Dividend yield (quarterly) | Litzenberger and Ramaswamy | 1979 | JF | |
| 226 | Zero-trading days (1 month) | Liu | 2006 | JFE | zero_trades_21d |
| 227 | Zero-trading days (6 months) | Liu | 2006 | JFE | zero_trades_126d |
| 228 | Zero-trading days (12 months) | Liu | 2006 | JFE | zero_trades_252d |
| 229 | Growth in advertising expenses | Lou | 2014 | RFS | |
| 230 | Initial public offerings | Loughran and Ritter | 1995 | JF | ipo |
| 231 | Enterprise multiple | Loughran and Wellman | 2011 | JFQA | enterprise_multiple |
| 232 | Enterprise multiple (JKP) | Loughran and Wellman | 2011 | JFQA | ebitda_mev |
| 233 | Enterprise multiple (quarterly) | Loughran and Wellman | 2011 | JFQA | |
| 234 | Changes in PPE and inventory/assets | Lyandres, Sun, and Zhang | 2008 | RFS | ppeinv_gr1a |
| 235 | Composite debt issuance | Lyandres, Sun, and Zhang | 2008 | RFS | debt_gr3 |
| 236 | Customer industries momentum | Menzly and Ozbas | 2010 | JF | |
| 237 | Supplier industries momentum | Menzly and Ozbas | 2010 | JF | |
| 238 | Dividend initiation | Michaely, Thaler, and Womack | 1995 | JF | divi |
| 239 | Dividend omission | Michaely, Thaler, and Womack | 1995 | JF | divo |
| 240 | Share price | Miller and Scholes | 1982 | JPE | price |
| 241 | Mohanram G-score | Mohanram | 2005 | RAS | |
| 242 | Industry momentum | Moskowitz and Grinblatt | 1999 | JFE | indmom |
| 243 | Operating leverage | Novy-Marx | 2011 | JFE | opex_at |
| 244 | Operating leverage (quarterly) | Novy-Marx | 2011 | JFE | |
| 245 | Intermediate momentum (7-12) | Novy-Marx | 2012 | ROF | ret_12_6 |
| 246 | Gross profits-to-assets | Novy-Marx | 2013 | JFE | gp_at |
| 247 | Gross profits-to-lagged assets | Novy-Marx | 2013 | JFE | gp_atl1 |
| 248 | Gross profits-to-lagged assets (quarterly) | Novy-Marx | 2013 | JFE | |
| 249 | Asset liquidity to book assets | Ortiz-Molina and Phillips | 2014 | JFQA | aliq_at |
| 250 | Asset liquidity to book assets (quarterly) | Ortiz-Molina and Phillips | 2014 | JFQA | |
| 251 | Asset liquidity to market assets | Ortiz-Molina and Phillips | 2014 | JFQA | aliq_mat |
| 252 | Asset liquidity to market assets (quarterly) | Ortiz-Molina and Phillips | 2014 | JFQA | |
| 253 | Cash flow-to-debt | Ou and Penman | 1989 | JAR | cashdebt |
| 254 | Change in current ratio | Ou and Penman | 1989 | JAR | pchcurrat |
| 255 | Change in quick ratio | Ou and Penman | 1989 | JAR | pchquick |
| 256 | Change in sales to inventory | Ou and Penman | 1989 | JAR | pchsaleinv |
| 257 | Current ratio | Ou and Penman | 1989 | JAR | currat |
| 258 | Quick ratio | Ou and Penman | 1989 | JAR | quick |
| 259 | Sales-to-cash | Ou and Penman | 1989 | JAR | salecash |
| 260 | Sales-to-inventory | Ou and Penman | 1989 | JAR | saleinv |
| 261 | Sales-to-receivables | Ou and Penman | 1989 | JAR | salerec |
| 262 | Cash-to-assets | Palazzo | 2012 | JFE | cash_at |
| 263 | Pastor-Stambaugh liquidity beta | Pastor and Stambaugh | 2003 | JPE | |
| 264 | Book-to-market enterprise value | Penman, Richardson, and Tuna | 2007 | JAR | bev_mev |
| 265 | Book-to-market enterprise value (quarterly) | Penman, Richardson, and Tuna | 2007 | JAR | |
| 266 | Net debt-to-price | Penman, Richardson, and Tuna | 2007 | JAR | netdebt_me |
| 267 | Net debt-to-price (quarterly) | Penman, Richardson, and Tuna | 2007 | JAR | |
| 268 | Piotroski F-score (JKP) | Piotroski | 2000 | AR | f_score |
| 269 | Piotroski F-score (GHZ, Org) | Piotroski | 2000 | AR | ps |
| 270 | Piotroski F-score (quarterly) | Piotroski | 2000 | AR | |
| 271 | Net stock issues (JKP, Org) | Pontiff and Woodgate | 2008 | JF | chcsho_12m |
| 272 | Net stock issues (GHZ) | Pontiff and Woodgate | 2008 | JF | chcsho |
| 273 | Order backlog | Rajgopal, Shevlin, and Venkatachalam | 2003 | RAS | |
| 274 | Unexpected quarterly earnings | Rendleman, Jones, and Latane | 1982 | JFE | |
| 275 | Change in common equity | Richardson et al. | 2005 | JAE | be_gr1a |
| 276 | Change in long-term investments | Richardson et al. | 2005 | JAE | lti_gr1a |
| 277 | Change in current operating liabilities | Richardson et al. | 2005 | JAE | col_gr1a |
| 278 | Change in current operating assets | Richardson et al. | 2005 | JAE | coa_gr1a |
| 279 | Change in financial liabilities | Richardson et al. | 2005 | JAE | fnl_gr1a |
| 280 | Change in long-term debt | Richardson et al. | 2005 | JAE | lgr |
| 281 | Change in net financial assets | Richardson et al. | 2005 | JAE | nfna_gr1a |
| 282 | Change in net non-cash working capital | Richardson et al. | 2005 | JAE | cowc_gr1a |
| 283 | Change in net non-current operating assets | Richardson et al. | 2005 | JAE | nncoa_gr1a |
| 284 | Change in non-current operating assets | Richardson et al. | 2005 | JAE | ncoa_gr1a |
| 285 | Change in non-current operating liabilities | Richardson et al. | 2005 | JAE | ncol_gr1a |
| 286 | Change in short-term investments | Richardson et al. | 2005 | JAE | sti_gr1a |
| 287 | Total accruals | Richardson et al. | 2005 | JAE | taccruals_at |
| 288 | Book to market (December ME, quarterly) | Rosenberg, Reid, and Lanstein | 1985 | JF | |
| 289 | Book-to-market (December ME) | Rosenberg, Reid, and Lanstein | 1985 | JF | be_me |
| 290 | Change in analyst coverage | Scherbina | 2008 | ROF | |
| 291 | Operating accruals (JKP) | Sloan | 1996 | AR | oaccruals_at |
| 292 | Operating accruals (GHZ, Org) | Sloan | 1996 | AR | acc |
| 293 | Asset turnover | Soliman | 2008 | AR | sale_bev |
| 294 | Asset turnover (quarterly) | Soliman | 2008 | AR | |
| 295 | Change in asset turnover | Soliman | 2008 | AR | chatoia |
| 296 | Change in profit margin | Soliman | 2008 | AR | chpmia |
| 297 | Profit margin | Soliman | 2008 | AR | ebit_sale |
| 298 | Profit margin (quarterly) | Soliman | 2008 | AR | |
| 299 | Return on net operating assets | Soliman | 2008 | AR | ebit_bev |
| 300 | Return on net operating assets (quarterly) | Soliman | 2008 | AR | |
| 301 | Mispricing factor: Management | Stambaugh and Yuan | 2016 | RFS | mispricing_mgmt |
| 302 | Mispricing factor: Performance | Stambaugh and Yuan | 2016 | RFS | mispricing_perf |
| 303 | Inventory change | Thomas and Zhang | 2002 | RAS | inv_gr1a |
| 304 | Tax expense surprise | Thomas and Zhang | 2011 | JAR | chtx |
| 305 | Abnormal corporate investment | Titman, Wei, and Xie | 2004 | JFQA | capex_abn |
| 306 | Real estate holdings | Tuzel | 2010 | RFS | realestate |
| 307 | Convertible debt indicator | Valta | 2016 | JFQA | convind |
| 308 | Convertible debt-to-total debt | Valta | 2016 | JFQA | |
| 309 | Secured debt indicator | Valta | 2016 | JFQA | securedind |
| 310 | Secured debt-to-total debt | Valta | 2016 | JFQA | secured |
| 311 | Whited-Wu index | Whited and Wu | 2006 | RFS | |
| 312 | Whited-Wu index (quarterly) | Whited and Wu | 2006 | RFS | |
| 313 | CAPEX growth (1 year) | Xie | 2001 | AR | capx_gr1 |
| 314 | Discretionary accruals | Xie | 2001 | AR | |

Structure

PyAnomaly consists of several modules; the core modules you are most likely to use are listed below. The full list of modules and their details can be found in the API documentation (pyanomaly).

  • wrdsdata.py: Defines WRDS, a class that handles data downloads from WRDS.

  • panel.py: Defines Panel, a base class for handling panel data.

  • characteristics.py: Classes to generate firm characteristics are defined here. These classes are derived from Panel.

    • FUNDA: A class to generate firm characteristics from funda.

    • FUNDQ: A class to generate firm characteristics from fundq.

    • CRSPM: A class to generate firm characteristics from crspm.

    • CRSPD: A class to generate firm characteristics from crspd.

    • Merge: A class to generate firm characteristics from a merged dataset of funda, fundq, crspm, and crspd.

  • analytics.py: A module that defines functions for analytics, such as 1-D sort, cross-sectional regression, and time-series regression.

  • datatools.py: A module that defines functions for data handling, such as data filtering, trimming, and winsorizing.
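To make the 1-D sort mentioned above concrete, here is a minimal pandas sketch that forms equal-weighted quantile portfolios on one characteristic and a long-short portfolio from the extreme bins. The function `one_dim_sort` and the column names are assumptions for illustration; they are not the functions in analytics.py. The characteristic is assumed to be already lagged relative to the return.

```python
import pandas as pd


def one_dim_sort(df: pd.DataFrame, char: str, ret: str, n: int = 3) -> pd.DataFrame:
    """Equal-weighted returns of n quantile portfolios formed each date on `char`."""
    out = df.copy()
    # Rank within each date first so ties cannot create duplicate bin edges,
    # then cut the ranks into n equally sized bins labelled 1..n.
    out["bin"] = out.groupby("date")[char].transform(
        lambda x: pd.qcut(x.rank(method="first"), n, labels=False) + 1
    )
    port = out.groupby(["date", "bin"])[ret].mean().unstack("bin")
    port["long_short"] = port[n] - port[1]  # top bin minus bottom bin
    return port


# Made-up panel: two dates, six stocks each.
panel = pd.DataFrame({
    "date": ["2020-01"] * 6 + ["2020-02"] * 6,
    "char": [1, 2, 3, 4, 5, 6] * 2,
    "ret": [0.01, 0.02, 0.03, 0.04, 0.05, 0.06,
            0.06, 0.05, 0.04, 0.03, 0.02, 0.01],
})
port = one_dim_sort(panel, "char", "ret", n=3)
print(port)
```

A value-weighted version would replace the simple mean with a market-equity-weighted average inside each date and bin.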

System Requirement

Recommendation
  • Disc space: minimum 100 GB

  • Memory: minimum 64 GB

The minimum system requirement depends on the configuration, e.g., what characteristics to generate or the sample period.

Disc space

The raw data downloaded from WRDS take up about 27 GB of disc space. The final output file can take up to 15 GB if all characteristics are generated and the raw data are saved together with them. The output file is much smaller (less than 5 GB) if only the firm characteristics are saved. In general, 100 GB should be sufficient for all types of tasks, even when interim results are saved.

Memory

Generating firm characteristics from daily data such as crspd consumes a significant amount of memory; usage can peak at about 50 GB. This does not mean you need physical memory of this size. Most operating systems use a paging file (swap space) to allocate some disc space as memory, although paging increases the running time.

Comparison to Other Sources

PyAnomaly benefits greatly from the SAS code of Green et al. (2017) and Jensen et al. (2021), and also from the papers and documentation of Hou et al. (2020) and Chen and Zimmermann (2020). We generally follow the SAS code of JKP and GHZ and validate our code against it, but when their implementation differs significantly from the original definition, we try to follow the original definition. When the implementation of a firm characteristic differs significantly between the two sources, we implement both versions under different function names. We also found several mistakes in their code. We note these mistakes, and the differences between our implementation and theirs, in the mapping file and in comments in the code. The SAS code of Jensen et al. (2021) was updated several times while we were developing PyAnomaly, so some of our documented comments may no longer be valid.

Comparison to the SAS code of Jensen et al. (2021)

PyAnomaly can be configured so that it replicates JKP’s SAS code as closely as possible. However, there are a few key differences that make our results differ from theirs.

Market equity

JKP use not only CRSP’s msf but also Compustat’s secm and secd to calculate market equity, and (roughly speaking) choose the maximum market equity among those calculated from different sources. We only use the price and shares outstanding from CRSP to calculate the market equity.
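The two approaches can be contrasted with a small pandas sketch. It assumes the standard CRSP conventions that prc is negative when it is a bid/ask average and that shrout is reported in thousands of shares. The Compustat-based series `comp_me` is made up, and the row-wise maximum is only a rough stand-in for JKP's selection rule.

```python
import pandas as pd

crsp = pd.DataFrame({
    "permno": [10001, 10001, 10002],
    "date": pd.to_datetime(["2020-01-31", "2020-02-28", "2020-01-31"]),
    "prc": [25.0, -24.5, 10.0],    # negative price = bid/ask average
    "shrout": [1000, 1000, 5000],  # shares outstanding, in thousands
})

# CRSP-only market equity in millions of dollars.
crsp["me"] = crsp["prc"].abs() * crsp["shrout"] / 1000

# Hypothetical market equity from a second source, with one missing value.
comp_me = pd.Series([26.0, None, 48.0], name="me_comp")

# Rough JKP-style alternative: take the largest value across sources (NaN is skipped).
crsp["me_max"] = pd.concat([crsp["me"], comp_me], axis=1).max(axis=1)
print(crsp[["permno", "date", "me", "me_max"]])
```

The two columns differ only where the second source reports a larger value, which is where results can diverge from JKP's.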

Merging FUNDA with FUNDQ

JKP update annual accounting variables quarterly using comp.fundq. More specifically, JKP create the same characteristics in funda and fundq separately and then merge them. Because some variables in funda, e.g., ebitda, are not available in fundq, JKP construct those variables from other variables before creating the characteristics, even when the variables are available in funda. In contrast, we merge funda with fundq at the raw data level and create characteristics from the merged data.
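A raw-level merge of this kind can be sketched with pandas as follows. This is a simplified illustration with made-up numbers, not PyAnomaly's actual merge logic: each quarterly record is matched to the most recent annual record, the quarterly value (atq) replaces the annual one (at) where available, and ebitda, which fundq lacks, is carried over from funda. Annualization of quarterly flow variables is ignored here.

```python
import pandas as pd

funda = pd.DataFrame({
    "gvkey": [1],
    "datadate": pd.to_datetime(["2019-12-31"]),
    "at": [100.0],      # total assets (annual)
    "ebitda": [20.0],   # available in funda only
})
fundq = pd.DataFrame({
    "gvkey": [1, 1],
    "datadate": pd.to_datetime(["2020-03-31", "2020-06-30"]),
    "atq": [105.0, 110.0],  # total assets (quarterly)
})

# For each quarterly record, attach the latest annual record on or before it.
merged = pd.merge_asof(
    fundq.sort_values("datadate"),
    funda.sort_values("datadate"),
    on="datadate",
    by="gvkey",
)

# Update the annual variable with the quarterly value where one exists.
merged["at"] = merged["atq"].combine_first(merged["at"])

# Characteristics are computed only after the raw data have been merged.
merged["ebitda_at"] = merged["ebitda"] / merged["at"]
print(merged[["gvkey", "datadate", "at", "ebitda", "ebitda_at"]])
```

Computing the ratio after the merge means the numerator comes directly from funda rather than from a reconstructed quarterly proxy.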

Share code filtering

JKP do not filter data using the CRSP share code (shrcd), whereas we use only ordinary common stocks (shrcd = 10, 11, or 12). We find that some stocks’ shrcd changes over time. Therefore, this difference affects not only the cross-section but also the time series.
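The time-series effect can be seen in a small pandas sketch with made-up data: when a stock's shrcd changes for one month, the filter removes that month and leaves a gap in the stock's history.

```python
import pandas as pd

crsp = pd.DataFrame({
    "permno": [10001, 10001, 10001, 10002, 10002],
    "date": pd.to_datetime(
        ["2020-01-31", "2020-02-29", "2020-03-31", "2020-01-31", "2020-02-29"]
    ),
    "shrcd": [11, 31, 11, 10, 10],  # permno 10001 changes share code in February
    "ret": [0.01, 0.02, -0.01, 0.03, 0.00],
})

# Keep ordinary common stocks only.
common = crsp[crsp["shrcd"].isin([10, 11, 12])]

# permno 10001 keeps January and March but loses February.
print(common.groupby("permno")["date"].apply(list))
```

Characteristics that rely on a continuous return history, such as rolling betas or momentum, can therefore differ for such stocks depending on whether the filter is applied.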

References

Glossary

  • crspd: CRSP daily data created from dsf, dsenames, and dseall.

  • crspm: CRSP monthly data created from msf, msenames, and mseall.

  • funda: Compustat annual accounting data created from funda.

  • fundq: Compustat quarterly accounting data created from fundq.

  • CZ: Either the paper or the R/Stata code of Chen and Zimmermann (2020).

  • GHZ: Either the paper or the SAS code of Green, Hand, and Zhang (2017).

  • HXZ: Hou, Xue, and Zhang (2020).

  • JKP: Either the paper or the SAS code of Jensen, Kelly, and Pedersen (2021).