Rie Johnson home page

Rie Johnson

Updated: October 2023.

I worked at the IBM T.J. Watson Research Center as a Research Staff Member (which would be called Research Scientist elsewhere) until 2007. Since then I have been enjoying being an independent researcher. My earlier publications, before May 11, 2007, are under my former name, Ando. I have a PhD in Computer Science from Cornell University.

Research interests: Machine learning.

Jet's pictures Megan's pictures

Recent publications

Kaggle Competitions

Heritage Health Prize (2013): First Prize; team POWERDOT. video
This was a two-year-long contest with a $3M Grand Prize, though no team reached the Grand Prize threshold. After Milestone 3 (see below), team crescendo merged with two other teams and became POWERDOT.

Heritage Health Prize Round 3 Milestone (2012): Second Prize; team crescendo with Tong Zhang. video

Bond Trade Price Challenge (2012): First Prize, with Tong Zhang.

Predicting a Biological Response (2012): Fourth place, with Tong Zhang.

The original motivation for participating in these competitions was to test Regularized Greedy Forest in a competitive setting. paper.

Older Publications

Conference and Journal Papers

Graph-based Semi-supervised Learning and Spectral Kernel Design. Rie Johnson and Tong Zhang. IEEE Transactions on Information Theory, 54(1):275-288, 2008.

Word Sense Disambiguation across Two Domains: Biomedical Literature and Clinical Notes. Guergana Savova, Anni Coden, Igor Sominsky, Rie Johnson, Philip Ogren, Piet de Groen, and Christopher Chute. Journal of Biomedical Informatics. 2008.

On the Effectiveness of Laplacian Normalization for Graph Semi-Supervised Learning. Rie Johnson and Tong Zhang. Journal of Machine Learning Research, 8:1489-1517, 2007.

Two-view Feature Generation Model for Semi-supervised Learning. Rie K. Ando and Tong Zhang. Proceedings of the 24th International Conference on Machine Learning (ICML). 2007.

BioCreative II Gene Mention Tagging System at IBM Watson. Rie K. Ando. Proceedings of the Second BioCreative Challenge Evaluation Workshop. 2007. First place among the 21 teams.

TimeBank Evolution as a Community Resource for TimeML Parsing. Branimir Boguraev, James Pustejovsky, Rie Ando, and Marc Verhagen. Language Resources and Evaluation, 41:91-115, 2007.

Learning on Graph with Laplacian Regularization. Rie K. Ando and Tong Zhang. Neural Information Processing Systems (NIPS), 2006.

Applying Alternating Structure Optimization to Word Sense Disambiguation. Rie K. Ando. Proceedings of the 10th Conference on Computational Natural Language Learning (CoNLL), 2006. Won the Best Paper Award.

Analysis of TimeBank as a Resource for TimeML parsing. Branimir Boguraev and Rie K. Ando. Proceedings of the 5th International Conference on Language Resources and Evaluation (LREC), 2006.

A Framework for Learning Predictive Structures from Multiple Tasks and Unlabeled Data. Rie K. Ando and Tong Zhang. Journal of Machine Learning Research, 6:1817-1853, 2005.

A High-Performance Semi-Supervised Learning Method for Text Chunking. Rie K. Ando and Tong Zhang. Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL), 2005.

TimeML-Compliant Text Analysis for Temporal Reasoning. Branimir Boguraev and Rie K. Ando. Proceedings of the 19th International Joint Conference on Artificial Intelligence (IJCAI), 2005.

Analysis of Spectral Kernel Design based Semi-supervised Learning. Tong Zhang and Rie K. Ando. Neural Information Processing Systems (NIPS), 2005.

TREC 2005 Genomics Track Experiments at IBM Watson. Rie K. Ando, Mark Dredze, and Tong Zhang. Proceedings of the Fourteenth Text REtrieval Conference (TREC 2005), 2005.

Visualization-Enabled Multi-Document Summarization by Iterative Residual Rescaling. Rie K. Ando, Branimir K. Boguraev, Roy J. Byrd, and Mary S. Neff. Natural Language Engineering, 11(1), 2005.

Domain-specific Language Models and Lexicons for Tagging. Anni R. Coden, Serguei V. Pakhomov, Rie K. Ando, Patrick H. Duffy, and Christopher G. Chute. Journal of Biomedical Informatics, 2005.

Semantic Lexicon Construction: Learning from Unlabeled Data via Spectral Analysis. Rie K. Ando. Proceedings of the Eighth Conference on Natural Language Learning (CoNLL), 2004.

Exploiting Unannotated Corpora for Tagging and Chunking. Rie K. Ando. Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics (ACL), 2004. Short paper.

Mostly-Unsupervised Statistical Segmentation of Japanese Kanji Sequences. Rie K. Ando and Lillian Lee. Natural Language Engineering, 9(2), 2003.

Iterative Residual Rescaling: Analyzing and Generalizing LSI. Rie K. Ando and Lillian Lee. Proceedings of the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 2001.

Latent Semantic Space: Iterative Scaling Improves Precision of Inter-document Similarity Measurement. Rie K. Ando. Proceedings of the 23rd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 2000.

Mostly-Unsupervised Statistical Segmentation of Japanese: Applications to Kanji. Rie K. Ando and Lillian Lee. Proceedings of the First Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), 2000.

Multi-document Summarization by Visualizing Topical Content. Rie K. Ando, Branimir K. Boguraev, Roy J. Byrd, and Mary S. Neff. Proceedings of ANLP/NAACL Workshop on Automatic Summarization, 2000.

PhD Thesis
The Document Representation Problem: An Analysis of LSI and Iterative Residual Rescaling. Rie Ando. 2001. Cornell Computer Science Technical Report TR2001-1843.