Resource files used by example shell scripts of ConText v4
Some of the example shell scripts for semi-supervised learning,
included in ConText,
use files that are not included in the package because of their size.
The shell scripts download them automatically as needed.
However, if for some reason it is more convenient to download them in advance,
they can be manually downloaded from this page.
Instructions
- Download the file you need by clicking its link below.
- Set the current directory to examples/ and extract the archive with tar -xvf,
for example:
cd examples
tar -xvf imdb-unlab.txt.tok.tar.gz
Doing so places the resource files in examples/for-semi.
The example shell scripts skip the download if they find the required files in examples/for-semi.
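The check-then-extract behavior described above can be sketched as a small helper for manual use. This is only an illustration, not code taken from the ConText scripts: the function name extract_if_missing is hypothetical, and the path passed as the second argument is an assumption, since this page does not list the file names inside each archive.

```shell
#!/bin/sh
# Hypothetical helper (not part of ConText): extract a manually downloaded
# archive only when the expected resource file is not already in place.
# Run it from the examples/ directory.
extract_if_missing() {
  archive=$1    # archive downloaded by hand, e.g. imdb-unlab.txt.tok.tar.gz
  expected=$2   # assumed path of a file that the archive provides

  if [ -e "$expected" ]; then
    echo "found $expected; nothing to do"
    return 0
  fi
  if [ ! -f "$archive" ]; then
    echo "missing $archive; download it first" >&2
    return 1
  fi
  # Same command as in the instructions above.
  tar -xvf "$archive"
}
```

A call would look like extract_if_missing imdb-unlab.txt.tok.tar.gz for-semi/some-file, where for-semi/some-file stands for whichever file the archive is expected to provide.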
Unlabeled tokenized text files
- IMDB unlabeled data, used in examples/*{unsup|parsup|2unsemb}*imdb*.sh
- Elec unlabeled data, used in examples/*{unsup|parsup|2unsemb}*elec*.sh
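The patterns above use {a|b|c} to mean "any one of these keywords", which is documentation shorthand rather than shell syntax. As a rough sketch of how to list the scripts that a pattern covers, the hypothetical function below (list_scripts is not part of ConText) expands each keyword into an ordinary glob.

```shell
#!/bin/sh
# Hypothetical helper: list the example scripts matched by a pattern such as
# examples/*{unsup|parsup|2unsemb}*imdb*.sh, reading {a|b|c} as alternatives.
list_scripts() {
  dir=$1        # directory holding the scripts, e.g. examples
  dataset=$2    # dataset keyword, e.g. imdb or elec
  shift 2       # the remaining arguments are the alternative keywords
  for key in "$@"; do
    for f in "$dir"/*"$key"*"$dataset"*.sh; do
      # An unmatched glob stays literal, so keep only paths that exist.
      if [ -e "$f" ]; then
        echo "$f"
      fi
    done
  done | sort -u   # a script matching several keywords is listed once
}
```

For example, list_scripts examples imdb unsup parsup 2unsemb prints the scripts that use the IMDB unlabeled data.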
Optional resource files
Some of the example shell scripts use the results of training conducted by another example shell script
(e.g., unsupervised embedding training).
For convenience, those training results are provided.
As noted above, the example shell scripts download and decompress them as needed,
so download them from this page only if doing so in advance is more convenient.
To do so, follow the instructions above.
NOTE1:
Use of the files below is optional. You can generate these files
yourself using the example shell scripts.
NOTE2:
The files are in little-endian format (Intel convention), and they
cannot be used on big-endian systems (Motorola convention).
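If you are unsure which byte order your machine uses, a quick check along the following lines may help. It is a generic sketch using standard printf and od, not something the ConText scripts do; the function name host_byte_order is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: report the host byte order by decoding the bytes
# 01 00 00 00 as one 4-byte integer. A little-endian host reads the value 1;
# a big-endian host reads 16777216.
host_byte_order() {
  value=$(printf '\001\000\000\000' | od -An -td4 | tr -d ' \n')
  case "$value" in
    1)        echo little ;;
    16777216) echo big ;;
    *)        echo unknown ;;
  esac
}
```

Calling host_byte_order prints little on hosts where the provided files can be used; on a host that reports big, generate the files yourself with the example shell scripts instead.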
Optional resource files for IMDB
- imdb-unsemb-v4.tar.gz : used in examples/*{3|5}unsemb*imdb*.sh;
containing 5 unsupervised embedding files (individual files can be downloaded by clicking *):
- imdb-uns-p5.dim100.epo10.ReLayer0.tar.gz *
- imdb-unsx3-p5.dim100.epo10.ReLayer0.tar.gz *
- imdb-parsup-p3p5.dim100.epo10.ReLayer0.tar.gz *
- imdb-LstmF-dim100.lay.epo30.ReLayer0.tar.gz *
- imdb-LstmB-dim100.lay.epo30.ReLayer0.tar.gz *
- for-parsup-imdb-p3.supmod.ReNet.tar.gz :
used in examples/*parsup*imdb*.sh; model file.
Optional resource files for Elec
- elec-unsemb-v4.tar.gz : used in examples/*{3|5}unsemb*elec*.sh;
containing 5 unsupervised embedding files (individual files can be downloaded by clicking *):
- elec-uns-p5.dim100.epo10.ReLayer0.tar.gz *
- elec-unsx3-p5.dim100.epo10.ReLayer0.tar.gz *
- elec-parsup-p3p5.dim100.epo10.ReLayer0.tar.gz *
- elec-LstmF-dim100.lay.epo30.ReLayer0.tar.gz *
- elec-LstmB-dim100.lay.epo30.ReLayer0.tar.gz *
- for-parsup-elec-p3.supmod.ReNet.tar.gz :
used in examples/*parsup*elec*.sh; model file.
Optional resource files for RCV1
- rcv1-unsemb-v4.tar.gz : used in examples/other-sh/*{3|5}unsemb*rcv1*.sh;
containing 5 unsupervised embedding files (individual files can be downloaded by clicking *):
- rcv1-uns-p20.dim100.epo10.ReLayer0.tar.gz *
- rcv1-unsx3-p20.dim100.epo10.ReLayer0.tar.gz *
- rcv1-parsup-p20p20.dim100.epo10.ReLayer0.tar.gz *
- rcv1-LstmF-dim300.lay.epo50.ReLayer0.tar.gz *
- rcv1-LstmB-dim300.lay.epo50.ReLayer0.tar.gz *
- for-parsup-rcv1-p20.supmod.ReNet.tar.gz :
used in examples/other-sh/*parsup*rcv1*.sh; model file.