posters

The following posters relate to Insilicos' research and scientific direction.

MR-Tandem: Parallel X!Tandem Using Hadoop MapReduce on Amazon Web Services

Brian Pratt, J. Jeffry Howbert, Natalie Tasman, Erik Nilsson

MR-Tandem adapts the popular X!Tandem peptide search engine to run in the Amazon Web Services cloud, so that investigators can easily and inexpensively create their own temporary compute clusters on demand and quickly conduct searches that would otherwise be excessively time-consuming.

MR-Tandem isn’t the only parallel implementation of X!Tandem. Standard X!Tandem already supports threaded execution, which is useful on multicore processors. The X!!Tandem project uses MPI to spread that threading model across multiple compute nodes, and MR-Tandem uses Hadoop to do the same thing. All three will yield identical results as long as the thread/node counts are the same.

The problem with X!!Tandem is that MPI is notoriously brittle as cluster size goes up. Specialized hardware is needed to satisfy MPI’s intolerance of network latency, and a failure anywhere in the MPI ring results in the entire search failing. In practice we find that it is very difficult to instantiate an MPI ring of more than about 10 nodes on AWS.

MR-Tandem’s use of Hadoop puts fault tolerance at the core of its design, and MR-Tandem will happily run on commodity hardware. If a worker node fails for any reason, its workload is transparently reassigned and the search completes despite the failure.
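The reassignment behavior described above can be illustrated with a toy scheduler. This is only a sketch under stated assumptions: the function `run_with_reassignment`, the task and worker names, and the round-robin policy are hypothetical and are not taken from MR-Tandem or from Hadoop's API. It shows only the general idea that work from a failed node goes back into the queue and is picked up by a surviving node.

```python
from collections import deque


def run_with_reassignment(tasks, workers, failing_workers):
    """Toy model of fault-tolerant scheduling (hypothetical, not Hadoop's API).

    Tasks are handed to workers in rotation. If the chosen worker has
    failed, it is dropped and its task is requeued for another worker.
    Returns a dict mapping each task to the worker that completed it.
    """
    pending = deque(tasks)
    alive = list(workers)
    completed = {}
    turn = 0
    while pending:
        if not alive:
            raise RuntimeError("no workers left to finish the search")
        task = pending.popleft()
        worker = alive[turn % len(alive)]
        if worker in failing_workers:
            # Simulated node failure: remove the worker, requeue its task.
            alive.remove(worker)
            pending.append(task)
            continue
        completed[task] = worker
        turn += 1
    return completed


if __name__ == "__main__":
    result = run_with_reassignment(
        tasks=["t0", "t1", "t2", "t3"],
        workers=["a", "b", "c"],
        failing_workers={"b"},
    )
    print(result)
```

In this sketch every task still completes even though worker "b" fails. In an MPI-style ring, by contrast, the same failure would abort the whole search.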

Presented at the 2011 Cascade Proteomics Symposium. Poster: [PDF] [PPT]


Comparison of Methods of Applying the Trans-Proteomic Pipeline to Mascot Peptide Matches

Bryan Prazen

The Trans-Proteomic Pipeline (TPP) is an open source proteomics data analysis pipeline introduced by the Aebersold group at the Institute for Systems Biology. This pipeline includes tools for quantitation and statistical validation of protein identifications from tandem mass spectrometry experiments. While TPP is extremely useful and well received, many users struggle with the application of this data analysis pipeline to Mascot search results. Most of the pitfalls encountered while applying the TPP to Mascot data stem from file format converters. There are multiple data processing paths available for the analysis of Mascot search results, some of which lead to more comprehensive protein identifications than others. Recently, Insilicos has issued a version of the TPP that is known as the Insilicos Trans-Proteomics Pipeline (IPP). This version of the pipeline allows for analysis using MS/MS data that went through Mascot Distiller or DTASuperCharger. Previous versions of the proteomic pipeline required that MS/MS data from some instruments be converted to mzXML or mzDATA prior to searching with Mascot. Presented are results of LC-MS/MS experiments in which a number of data analysis paths are explained and compared.

Presented at US HUPO, March 2007. Poster: [PDF]


A Fast, Robust and Reliable Data Analysis Pipeline to Produce Statistically Valid Protein Identifications and Quantitation Results from Tandem MS

Bryan Prazen

The Insilicos Proteomics Pipeline (IPP) includes modules for validation of database search results, quantitation of isotopically labeled samples and other proteomics functions. IPP is based on the Trans-Proteomic Pipeline (TPP), a popular open source toolkit for proteomics data processing, introduced by the Aebersold group at the Institute for Systems Biology (ISB).

Presented at HUPO, October 2006. Poster: [PDF]


Recent Advances in Reliability, Performance and Usability of the Trans-Proteomic Pipeline (TPP) Software Tools

Brian Pratt

The Insilicos Proteomics Pipeline (IPP) is a proprietary software layer around the TPP that uses the core TPP logic while dramatically boosting performance. The result is a performance-enhanced version of the TPP that is significantly faster, more robust, and easier to install and use. Insilicos continues to be an active contributor to the open source TPP code.

Poster: [PDF]


Instrument Specific Calibration of PeptideProphet

Bryan Prazen, Erik Nilsson, Brian Pratt, Martin Sadilek, Daniel Martin, John Klimek, Andrew Gemmill, Laura Hohmann, Jennifer Jackson

PeptideProphet was first calibrated using the ThermoFinnigan LCQ mass spectrometer. For this instrument, it was shown that filtering on PeptideProphet-computed probabilities yields more correct matches at any given error rate than conventional filtering criteria. Yet little is known about how PeptideProphet performs on other instruments. We evaluate the need for instrument-specific calibration of PeptideProphet using two ThermoFinnigan LCQ instruments, a ThermoFinnigan LTQ instrument, and an Applied Biosystems API QSTAR Pulsar instrument.

Poster: [PDF]


A Novel Visualization Tool for Common Mass Spectrometric File Formats

Erik Nilsson, Brian Pratt, Bryan Prazen

Each mass spectrometer manufacturer stores data in a unique file format, some of which are proprietary. Proprietary formats limit the ability to compare results between research groups or even between instruments maintained by the same group. Such formats also discourage the exchange of raw data files and the creation of data repositories. mzXML and mzDATA are two common file formats developed to allow for the exchange of MS data. Presented here is a fast and novel visualization tool for mzXML and mzDATA files, known as InsilicosViewer.

Poster: [PDF]