CONTEXT v3: Convolutional neural networks and LSTM for text categorization in C++ on GPU

Updated: June 1, 2017.    Latest version: CONTEXT v3.00 (May 26, 2016).    Next version: CONTEXT v4.00 (late July 2017)
CONTEXT provides (and will provide) implementations of the following types of neural networks for text categorization: supervised CNNs [JZ15a], semi-supervised CNNs [JZ15b], supervised and semi-supervised LSTMs [JZ16a], and deep pyramid CNNs [JZ17].

Looking for a tool?

NOTE1: The code runs only on a GPU (graphics processing unit); your system must have a GPU to run it.
NOTE2: The code was tested on Linux with gcc. In principle, it should compile and run on other systems (e.g., Windows) as well, provided that the prerequisites are installed (see README for details), but there is no guarantee. Some Windows users have kindly reported that it works on their Windows systems.

Download

Documentation

The documentation is available in PDF and HTML.

Getting started

  1. Download the code, extract the files, and read README.
  2. Go to the top directory and build the executables by entering make.
  3. Go to sample/ and enter ./sample.sh, which trains and tests a CNN on small data to confirm the installation.

    NOTE: If the compute capability of your GPU card is higher than 3.5 (e.g., the Maxwell or Pascal architecture), sample.sh will probably result in either an error (e.g., invalid device function) or an error rate much worse than 0.1725. If this happens, modify makefile to include the compute capability of your GPU (e.g., if it is 5.2, add -gencode arch=compute_52,code=sm_52), rebuild the executables by entering make clean and then make, and retry sample.sh. The compute capability of your card is listed in Wikipedia's CUDA article; it can also be found by entering gpuDevice in MATLAB.

  4. To get started with ...
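As a sketch of the makefile change described in the note under step 3, assuming a GPU of compute capability 5.2 (adjust both numbers for your card; the variable name NVCCFLAGS is illustrative and may differ in the actual makefile):

```makefile
# Hypothetical fragment of the CONTEXT makefile: append a -gencode pair
# matching your GPU's compute capability (here 5.2) to the nvcc flags.
NVCCFLAGS += -gencode arch=compute_52,code=sm_52
```

After editing, rebuild with make clean followed by make, then rerun sample.sh.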

To reproduce the experiments in the papers

Looking for a baseline?    Note that if a supervised CNN of [JZ15a] outperforms, say, a CNN using word vectors pre-trained on additional unlabeled data, this is rather surprising (though possible), since the word-vector CNN is empowered by a large amount of extra information from the additional unlabeled data while the supervised CNN is not. A more 'apples-to-apples' comparison would be the CNN with pre-trained word vectors vs. a semi-supervised CNN as in [JZ15b] or [JZ17].

Data Source

The data files in the code/data archives were derived from the Large Movie Review Dataset (IMDB) [MDPHN11] and Amazon reviews [ML13].

License

This program is free software released under the GNU General Public License v3.

References

[JZ17] Rie Johnson and Tong Zhang. Deep pyramid convolutional neural networks for text categorization. To appear in ACL 2017.
[JZ16b] Rie Johnson and Tong Zhang. Convolutional neural networks for text categorization: shallow word-level vs. deep character-level. arXiv:1609.00718, 2016.
[JZ16a] Rie Johnson and Tong Zhang. Supervised and semi-supervised text categorization using LSTM for region embeddings. ICML 2016.
[JZ15b] Rie Johnson and Tong Zhang. Semi-supervised convolutional neural networks for text categorization via region embedding. NIPS 2015.
[JZ15a] Rie Johnson and Tong Zhang. Effective use of word order for text categorization with convolutional neural networks. NAACL-HLT 2015.
[ML13] Julian McAuley and Jure Leskovec. Hidden factors and hidden topics: understanding rating dimensions with review text. RecSys 2013.
[MDPHN11] Andrew L. Maas, Raymond E. Daly, Peter T. Pham, Dan Huang, Andrew Y. Ng, and Christopher Potts. Learning word vectors for sentiment analysis. ACL 2011.