Textpresso is an online literature search and curation platform covering research papers on model organisms and other biomedical subjects. It enables biocurators and biomedical researchers to search and mine the full text of literature by integrating keyword and category searches with viewing search results in the context of the full text. It also allows users to create customized curation interfaces, use those interfaces to make annotations linked to supporting evidence statements, and then send those annotations to any database in the world. We also provide text mining and text classification support for curators of biomedical databases to identify and extract biological entities and facts from the full text of research articles.

To navigate the site successfully, please note:

  • We are currently transitioning from the old system (pure search engine) to our new system (search engine and curation platform). You can use the buttons below to proceed directly to the site of your choice.

    Search engine and curation platform (new site)
    Search engine only (old site)

  • During the transition period both sites continue to be supported, but eventually the old site will be retired. We recommend using the new site only.
  • You can browse through other relevant information about Textpresso by using the navigation bar in the upper right corner.

Textpresso Sites

Textpresso Central

PubMed Central Open Access subset (~1.5 million papers) and the WormBase C. elegans bibliography.

Textpresso for C. elegans

20,000 full text articles from the WormBase C. elegans bibliography.

Textpresso for Mouse

114,000 full text papers pertaining to the model organism mouse published up to the year 2011.

Textpresso for Fly

Corpus contains 44,850 full text papers and 50,227 abstracts up to the year 2012.

Textpresso for Neuroscience

200,000 neuroscience papers published up to 2009.

Textpresso for Arabidopsis

40,000 A. thaliana papers. Last update: June 2015.

Textpresso for Nematode

Contains around 17,000 full text nematode papers.

Other Textpresso Sites

More sites with smaller corpora; some of them are pilot projects.


Textpresso was originally described in:

Textpresso: an ontology-based information retrieval and extraction system for biological literature
Müller HM, Kenny EE, Sternberg PW.
PLoS Biol. 2004 Nov;2(11):e309. Epub 2004 Sep 21.

Textpresso Central, the new curation platform, is described in:

Textpresso Central: a customizable platform for searching, text mining, viewing, and curating biomedical literature
Müller HM, Van Auken KM, Li Y, Sternberg PW.
Accepted for publication in BMC Bioinformatics.

Please use these references if you are citing Textpresso.

Click here for papers citing Textpresso that are available on PubMed Central.

Papers from or with significant contributions from the Textpresso group

BC4GO: a full-text corpus for the BioCreative IV GO task.
Van Auken K, Schaeffer ML, McQuilton P, Laulederkind SJ, Li D, Wang SJ, Hayman GT, Tweedie S, Arighi CN, Done J, Müller HM, Sternberg PW, Mao Y, Wei CH, Lu Z.
Database (Oxford). 2014 Jul 28;2014:bau074.

An overview of the BioCreative 2012 workshop track III: interactive text mining task.
Arighi CN, Carterette B, Cohen KB, Krallinger M, Wilbur MJ, Fey P, Dodson R, Cooper L, Van Slyke CE, Dahdul W, Mabee P, Li D, Harris B, Gillespie M, Jiminez S, Roberts P, Matthews L, Becker K, Drabkin H, Bello S, Licata L, Chatraryamontri A, Schaeffer ML, Park J, Haendel M, Van Auken KM, Li Y, Chan J, Müller HM, Cui HM, Balhoff JP, Wu CY, Lu Z, Wei CH, Tudor CO, Raja K, Subramani S, Natarajan J, Cejuela JM, Dubey P, and Wu C.
Database (Oxford). 2013 Jan 17;2013:bas056.

Textmining in the biocuration workflow: application for literature curation at WormBase, dictyBase, and TAIR.
Van Auken KM, Fey P, Berardini TZ, Dodson R, Cooper L, Li D, Chan J, Li Y, Basu S, Müller HM, Chisholm R, Huala E, Sternberg PW, and the WormBase Consortium.
Database (Oxford). 2012 Nov 17;2012:bas040.

A hybrid human and machine resource curation pipeline for the Neuroscience Information Framework.
Bandrowski AE, Cachat J, Li Y, Müller HM, Sternberg PW, Ciccarese P, Clark T, Marenco L, Wang R, Astakhov V, Grethe JS, Martone ME.
Database (Oxford). 2012 Mar 20;2012:bas005.

Toward an interactive article: integrating journals and biological databases.
Rangarajan A, Schedl T, Yook K, Chan J, Haenel S, Otis L, Faelten S, DePellegrin-Connelly T, Isaacson R, Skrzypek MS, Marygold SJ, Stefancsik R, Cherry JM, Sternberg PW, Müller HM.
BMC Bioinformatics. 2011 May 19;12:175.

Semi-automated curation of protein subcellular localization: a text mining-based approach to Gene Ontology (GO) Cellular Component curation.
Van Auken K, Jaffery J, Chan J, Müller HM, Sternberg PW.
BMC Bioinformatics. 2009 Jul 21;10:228.

Textpresso for Neuroscience: Searching the Full Text of Thousands of Neuroscience Research Papers
Müller HM, Rangarajan A, Teal TK, Sternberg PW.
Neuroinformatics. 2008 Sep;6(3):195-204. Epub 2008 Oct 24.

Automatic document classification of biological literature
Chen D, Müller HM, Sternberg PW.
BMC Bioinformatics. 2006 Aug 7;7:370.

Download Textpresso Software

Downloads are available for the new and the old system. Follow the links.

New Old

About Textpresso

Textpresso was initially developed by Hans-Michael Müller, Eimear Kenny and Paul W. Sternberg, with contributions from Juancarlos Chan, David Chen, Arun Rangarajan, Tracy K. Teal, Yuling Li and James Done.

Textpresso Central, the new curation platform, was originally developed by Hans-Michael Müller, Yuling Li, Kimberly Van Auken and Paul W. Sternberg. Current developers are Valerio Arnaboldi and Hans-Michael Müller.

Textpresso is part of WormBase at the California Institute of Technology, California, U.S.A. Textpresso has been, and continues to be, supported by various grants from the National Human Genome Research Institute at the National Institutes of Health.

The site uses various open access ontologies and software libraries. Copyrights are with those who developed them. All other materials on this site are copyrighted by the California Institute of Technology. No materials appearing on this server may be reproduced or stored in a retrieval system without prior written permission of the publishers, and in no case for profit. Documents from this server are provided "AS-IS" without any warranty, expressed or implied.

Any results and output obtained from the Textpresso™ servers shall not be used for any purpose other than private study, scholarship or academic research. Anybody using Textpresso in excess of "fair use" may be liable for copyright infringement.

Hans-Michael Müller and Paul W. Sternberg claim all rights in the word "Textpresso" as a trademark.