Brainstorm: massive “crowdsourcing” of proteomic processing

by Brian | 3rd November 2009

Similar in principle to the various folding@home projects. At first glance GFS may not seem conducive to this kind of segmentation: genomic data is quite large, which would make download times prohibitive.

This is not the case if a node “specializes” in a subset of the genome. A node only needs access to a spectrum file and all polypeptides with a similar theoretical mass, so the download for one section of the problem does not have to be very large. A node could, for example, specialize in finding matches for polypeptides of 1000 Da to 1005 Da. That node downloads, once, all in-silico digested fragments within that mass range. After that, it waits to be called upon by the server, which parcels out spectrum files to the appropriate nodes. If the server reaches a spectrum file with a precursor mass of 1003 Da, it may decide to send the spectrum to this node.
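To make the routing idea concrete, here is a minimal sketch in Python of how a server might map a precursor mass to the node that specializes in that mass window. Everything in it is hypothetical: the `Node` class, the `route` and `assign_peptides` functions, the 5 Da window width, and the choice of half-open windows (lower bound inclusive, upper bound exclusive) are assumptions made for illustration, not part of any existing GFS code. It also treats the precursor mass as a single number and ignores mass tolerance and charge state, which a real system would need to handle.

```python
from bisect import bisect_right
from dataclasses import dataclass, field


@dataclass
class Node:
    """A hypothetical worker that specializes in one mass window."""
    name: str
    low_da: float    # inclusive lower bound of the window (assumption)
    high_da: float   # exclusive upper bound of the window (assumption)
    peptides: list = field(default_factory=list)  # (sequence, mass) pairs


def route(nodes, mass_da):
    """Return the node whose window contains mass_da, or None.

    Assumes `nodes` is sorted by low_da and the windows do not overlap.
    """
    lows = [n.low_da for n in nodes]
    i = bisect_right(lows, mass_da) - 1  # last window starting at or below mass_da
    if i >= 0 and mass_da < nodes[i].high_da:
        return nodes[i]
    return None


def assign_peptides(nodes, peptides):
    """One-time distribution of in-silico digested fragments to nodes."""
    for sequence, mass_da in peptides:
        node = route(nodes, mass_da)
        if node is not None:
            node.peptides.append((sequence, mass_da))


# Three adjacent 5 Da windows; the names are made up for the example.
nodes = [
    Node("node-995-1000", 995.0, 1000.0),
    Node("node-1000-1005", 1000.0, 1005.0),
    Node("node-1005-1010", 1005.0, 1010.0),
]

# A spectrum with a precursor mass of 1003 Da goes to the 1000-1005 Da node.
target = route(nodes, 1003.0)
print(target.name)
```

The same `route` function serves both phases: distributing the digested fragments once up front, and later dispatching each incoming spectrum by its precursor mass.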
