by Brian | 28th October 2010
When we calculate our E values we are assuming that [...] This is a great example of the type of histogram on which the Fenyo method of calculating E values performs poorly. See that one tall histogram bar? Because it falls in the small section where the least-squares line is calculated, the result is a much steeper [...] Read More
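To make the sensitivity concrete, here is a minimal sketch of a survival-curve fit in the spirit of the Fenyo method. It is not TandemFit's code: the function names, the `fit_fraction` parameter, and the choice to fit the top half of the bins are all assumptions made for illustration.

```python
import math

def survival_curve(scores, bin_width=1.0):
    """Return (bin_start, number of scores >= bin_start) for each histogram bin."""
    first = math.floor(min(scores) / bin_width)
    last = math.floor(max(scores) / bin_width)
    curve = []
    for b in range(first, last + 1):
        start = b * bin_width
        curve.append((start, sum(1 for s in scores if s >= start)))
    return curve

def fit_tail(scores, fit_fraction=0.5, bin_width=1.0):
    """Least-squares line through log10(survival) over the top fit_fraction of bins.

    fit_fraction is a hypothetical knob; the real fitting region may be chosen differently.
    """
    curve = survival_curve(scores, bin_width)
    tail = curve[int(len(curve) * (1 - fit_fraction)):]
    points = [(x, math.log10(count)) for x, count in tail]
    if len(points) < 2:
        raise ValueError("need at least two bins in the fitted region")
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def expectation_value(scores, query_score, fit_fraction=0.5, bin_width=1.0):
    """Extrapolate the fitted line to estimate how many random scores reach query_score."""
    slope, intercept = fit_tail(scores, fit_fraction, bin_width)
    return 10 ** (slope * query_score + intercept)
```

In this toy setup, piling extra scores into a single bin inside the fitted region makes the fitted slope steeper, which is the kind of distortion a lone tall bar can cause.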
by Brian | 20th August 2010
Here’s a kind of nifty animation of how TandemFit walks through each MS/MS peak twice (once forwards, once backwards) to find the matches to the theoretical ions of a peptide. Note that the ion being searched for is displayed. The animation! Read More
by Brian | 19th August 2010
It looks like spectrum: T10475_Well_A13_2025.07_16898.mgf..pkl and spectrum: T10475_Well_A13_2025.07_17096.mgf..pkl are the same. This is bad news for me, as TandemFit gets both of those “wrong”. The quotes are there because TandemFit’s match of QAGLQLQESLEPAVRLDR has 11 fragment alignments vs. 2 produced by VPAPSIEDICHVLSTVCK, which is the “correct” peptide.
Update: other duplicates:
T10475_Well_A12_1386.68_16898.mgf..pkl
T10475_Well_A12_1386.68_17096.mgf..pkl
T10475_Well_A03_1551.77_16898.mgf..pkl
T10475_Well_A03_1551.77_17096.mgf..pkl
T10475_Well_A11_1386.69_16898.mgf..pkl
T10475_Well_A11_1386.69_17096.mgf..pkl
T10475_Well_A10_1188.45_17096.mgf..pkl
[...] Read More
by Brian | 28th June 2010
The Chen lab recently gave us a set of spectra derived from the USP set of 50 proteins. We were told that this set probably contains a good number of impurities, so some of the spectra will not correspond to peptides that can be found in the USP list. [...] Read More
by Brian | 24th June 2010
Just added a method and a modified constructor to the Spectrum object that allow for normalization of the peak intensities. It finds the maximum intensity for a given spectrum and then walks through each peak, dividing its intensity by that maximum. This can be handy for SpectrumMatch as well as for comparing scores TandemFit [...] Read More
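The normalization described above can be sketched in a few lines. This is an illustration only, not the Spectrum class itself: the function name and the representation of peaks as `(mz, intensity)` tuples are assumptions.

```python
def normalize_intensities(peaks):
    """Scale intensities so the tallest peak becomes 1.0.

    peaks is assumed to be a list of (mz, intensity) tuples.
    """
    if not peaks:
        return []
    max_intensity = max(intensity for _, intensity in peaks)
    if max_intensity <= 0:
        raise ValueError("spectrum has no positive intensity to normalize by")
    # Walk through each peak and divide by the maximum found above.
    return [(mz, intensity / max_intensity) for mz, intensity in peaks]
```

Returning a new list rather than modifying the peaks in place is a design choice of this sketch; it keeps the raw intensities available for anything that still needs them.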
by Brian | 2nd December 2009
To judge the possible effectiveness of comparing spectra based on a hash of their peaks I wrote a program to take a list of spectra and convert them to hashes. From that set of spectra there were two pairs of sibling spectra that should be close matches as I am confident that they are derived [...] Read More
by Brian | 1st December 2009
Note to Brian: try simply summing the intensities of peaks that come within a certain delta of a frag mass. Then divide that sum by Max(peak intensities). This will often grab more than one peak for a fragment, but this may produce good results as often multiple peaks cluster closely around ions. Read More
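A rough sketch of that idea, assuming peaks are `(mz, intensity)` tuples; the function name and the example numbers are made up for illustration.

```python
def fragment_score(peaks, fragment_mass, delta):
    """Sum the intensities of all peaks within delta of fragment_mass,
    then divide by the largest intensity in the spectrum."""
    max_intensity = max(intensity for _, intensity in peaks)
    total = sum(intensity for mz, intensity in peaks
                if abs(mz - fragment_mass) <= delta)
    return total / max_intensity

# Hypothetical spectrum: two peaks cluster around a fragment at 500.0.
peaks = [(499.8, 30.0), (500.1, 60.0), (650.0, 120.0)]
score = fragment_score(peaks, 500.0, 0.5)  # (30 + 60) / 120
```

Because several peaks can be summed for one fragment, the score is not capped at 1.0 when a cluster includes the most intense peak.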
by Brian | 19th November 2009
In MSMSFit, differenceThreshold could be a function of peak intensity. That is, there can be more leeway for higher-intensity peaks. Read More
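One possible shape for such a function is a linear ramp, sketched here. The function name, the `base` and `extra` parameters, and their default values are all hypothetical; MSMSFit's actual threshold is not shown in this note.

```python
def difference_threshold(intensity, max_intensity, base=0.2, extra=0.3):
    """Allowed mass difference for a peak, growing linearly with relative intensity.

    base is the threshold for the weakest peaks and base + extra the threshold
    for the most intense peak. Both values are placeholders.
    """
    if max_intensity <= 0:
        raise ValueError("max_intensity must be positive")
    # Clamp the relative intensity to the range [0, 1].
    relative = min(max(intensity / max_intensity, 0.0), 1.0)
    return base + extra * relative
```

A linear ramp is only the simplest option; a step function or a logarithmic scale would express the same idea with a different trade-off.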
by Brian | 21st October 2009
Fragments Report [Figure: fragments report for null modifications, Carboxymethyl (C), Acetyl (N-term), Carbamidomethyl (C), Oxidation (M), Phospho (ST)] Or in bar form… [Figure: the same modifications as a bar chart] Read More
by Brian | 9th October 2009
“Tandem mass data hash,” try to say that five times fast. To get a better feel for how MS/MS data will be represented as a hash I wrote a quick visualizer. How the MSMS data is hashed: an NxM bin matrix is computed where N is the number of bins to store mass and M [...] Read More
by Brian | 9th October 2009
This was confusing the hell out of me. I’d have an fprintf call that would say something like “loading spectra…” and a breakpoint before that and in the console it would still print “loading spectra…” and continue on its merry way like the breakpoint didn’t even exist. So what’s the real problem? I don’t know [...] Read More
by Brian | 18th September 2009
by Brian | 26th August 2009
Jainab and I have discussed this before, but I want to get it down: Taking your peak list, sorting by intensity, then selecting the top X of them can be detrimental for this reason: Some locations where fragments are very likely to occur and with high intensities can act as intensity hogs. They will be [...] Read More
by Brian | 18th August 2009
For my reference more than for yours:

    frenchbroad:gfs.git risk2$ git cvsimport -v -d :local:/Volumes/LabShare/cvsroot GFS-Vec-V2
    Initialized empty Git repository in /Users/risk2/Documents/gfs.git/.git/
    Running cvsps...
    cvs_direct initialized to CVSROOT /Volumes/LabShare/cvsroot
    cvs [rlog aborted]: -t/-f wrappers not supported by this version of CVS
    A legacy version of cvs with -t/-f wrapper support is available as: /usr/bin/ocvs.

Read More
by Brian | 17th August 2009
You probably don’t need to. If you have the Git tool installed it is probably one of the tools that comes with it. To check try this on the command line: git help -a | grep cvsimport I got that from this site on how to import CVS into Git. The official Git page on [...] Read More
by Brian | 28th July 2009
The data is from OutputFile2.pkl. The x-axis is increments of 10,000. The final bar represents all peaks with intensities of 20,000 and above. Read More
by Brian | 25th July 2009
The following are the results from a small experiment testing the hypothesis that combining the spectral data for (probably) matching poly-peptides yields a higher HMM_Score. Running a certain PKL (to me known as “output2.pkl”) on chr4 with GFS we see that two of the returned sequences pop up twice with relatively high HMM_Scores. This strongly [...] Read More
by Brian | 16th July 2009
Here’s the opendiff view of a file before (left) and after (right) peaks have been cleaned. UPDATE: Examples of how cleaning may cut out some ion matches.

Without cleaning:

peaks  ion matches  precursor mass  sequence
208    11           1544.112183     HGTDDGVVWMNWK
324    10           1544.512207     HGTDDGVVWMNWK
324    10           1544.512207     HGTDDGVVWMNWK
267    10           1818.752197     TMTIHNGMFFSTYDR
224    10           2125.862305     HQLYIDETVNSNIPTNLR
[...] Read More
by Brian | 8th July 2009
The good news is that preliminary performance of my MSE algorithms is FAST… like, doesn’t even add a second to overall performance. If I can get some good correlation between HMMScore and MSE (especially low scoring HMMScore) then it could be a really good filter. Right now there is not good correlation, but I feel [...] Read More
by Brian | 7th July 2009
I modified the code to use more precise amino acid weights, and I added frag17 (which subtracts a nitrogen and three hydrogens) and frag18 (which subtracts an oxygen and two hydrogens). At this scale the frag17 and frag18 values are so close that they overlap. Still, you can see that for the correct acid sequence there [...] Read More
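For reference, the two losses can be worked out from atomic masses as in this small sketch. It assumes monoisotopic masses, which may not match the weights used in the actual code, and the function name is made up.

```python
# Monoisotopic atomic masses in daltons (assumed; average masses differ slightly).
NITROGEN = 14.0030740
HYDROGEN = 1.00782503
OXYGEN = 15.9949146

AMMONIA_LOSS = NITROGEN + 3 * HYDROGEN  # about 17.0265, one N and three H
WATER_LOSS = OXYGEN + 2 * HYDROGEN      # about 18.0106, one O and two H

def neutral_loss_fragments(fragment_mass):
    """Return the frag17 and frag18 masses for a given fragment mass."""
    return {
        "frag17": fragment_mass - AMMONIA_LOSS,
        "frag18": fragment_mass - WATER_LOSS,
    }
```

The two losses differ by just under 1 Da, which is why the frag17 and frag18 values sit almost on top of each other when plotted across a full spectrum.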