Below are some papers that you should read. Some of them describe tools that no longer exist, but the concepts they introduce are the foundation for most subsequent research.
- Mass spectrometry-based proteomics (pubmed)
- How do shotgun proteomics algorithms identify proteins (pubmed)
- Computational mass spectrometry-based proteomics (pubmed)
- Sequest and database matching (link). Good explanation of matching candidate sequences to a spectrum.
- Dancik paper (pubmed) - This paper puts forth basic concepts for understanding and exploring spectra: the offset frequency function, self-convolution, spectra represented as graphs, and de novo sequencing.
- PeptideProphet (pubmed) - Distinguishing true from false matches in a rigorous and statistically justified way.
- Decoy Databases (pubmed) - An abstraction of the concepts from the PeptideProphet paper. This method has become popular because it is easy to implement and conceptually clear.
- Comprehensive review (pubmed)
- The Protein Inference Problem (pubmed) - this paper describes why bottom-up proteomics has difficulty in unambiguously identifying proteins.
- Parsimony (pubmed) - one of the frequently used criteria to help roll up peptides into proteins.
- Matching MS1 features across datasets (pubmed) - This paper describes matching MS1 features by accurate mass and retention time. The popular MaxQuant match-between-runs feature and all other related techniques are reimplementations of this original method.
- Isobaric labeling (pubmed) - this paper describes how to multiplex different experiments into one run using a special labeling technique.
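
To make the decoy database entry above more concrete, here is a minimal sketch of how a score cutoff might be chosen from target and decoy matches. It is an illustration under stated assumptions, not the method of any specific paper or tool: the function name `decoy_fdr_threshold`, the `(score, is_decoy)` input format, and the simple estimate FDR = decoys / targets (which assumes a decoy database of the same size as the target database and higher scores being better) are all choices made for this example.

```python
from typing import List, Optional, Tuple


def decoy_fdr_threshold(
    psms: List[Tuple[float, bool]], max_fdr: float = 0.01
) -> Optional[float]:
    """Return the lowest score cutoff whose estimated FDR is <= max_fdr.

    psms is a list of (score, is_decoy) pairs, one per peptide-spectrum match.
    Assumptions of this sketch: higher scores are better, ties are not
    treated specially, and FDR is estimated as decoys / targets above
    the cutoff.
    """
    # Walk down the list from the best-scoring match to the worst.
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    targets = 0
    decoys = 0
    best_cutoff = None
    for score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Estimated FDR among all matches scoring at or above this score.
        if targets > 0 and decoys / targets <= max_fdr:
            best_cutoff = score
    return best_cutoff


# Hypothetical toy data: (score, is_decoy)
example_psms = [
    (10.0, False),
    (9.0, False),
    (8.0, False),
    (7.0, True),
    (6.0, False),
    (5.0, True),
]
cutoff = decoy_fdr_threshold(example_psms, max_fdr=0.25)
```

Matches scoring at or above the returned cutoff would be accepted; a return value of `None` means no cutoff satisfies the requested FDR. Real pipelines usually add refinements (for example q-values or a different correction for concatenated searches), which this sketch leaves out.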