Data used in the neural network experiments

Updated: July 21, 2017

Elec: electronic product reviews for sentiment classification

These datasets were derived from a large collection of Amazon reviews [ML13]. Note that the data should be used for research purposes only.

RCV1 (Reuters Corpus Version 1)

RCV1 must be obtained from NIST; the corpus is described in [LYRL04].

References

[JZ16] Rie Johnson and Tong Zhang. Supervised and semi-supervised text categorization using LSTM for region embeddings. ICML 2016.
[JZ15b] Rie Johnson and Tong Zhang. Semi-supervised convolutional neural networks for text categorization via region embedding. NIPS 2015.
[JZ15a] Rie Johnson and Tong Zhang. Effective use of word order for text categorization with convolutional neural networks. NAACL-HLT 2015.
[ML13] Julian McAuley and Jure Leskovec. Hidden factors and hidden topics: understanding rating dimensions with review text. RecSys 2013.
[LYRL04] David D. Lewis, Yiming Yang, Tony G. Rose, and Fan Li. RCV1: A new benchmark collection for text categorization research. Journal of Machine Learning Research, 5:361-397, 2004.
