Bioinformatics research project

Hi friends,


Calling all bioinformaticians to help me! I have to do a project using bioinformatics tools. Can anyone please suggest what would be the best topic for me to pursue in grad studies?

I have to develop a tool (program) to do some basic BI calculations/simulations/interpretations that can achieve a desired level of result. I have already done a project on the analysis of microarray data, so I am inclined towards proteomics/protein chemistry, but I am open to any good research topic.


Can some people help me by describing some good projects that could be done (at grad level)? Please give relevant details on data availability and the conclusions that could be drawn at the end, with some reference to the procedure/algorithm adopted. I will be grateful to everyone who helps, and they will definitely get a mention in the final project/thesis.

Some broad areas that I am aware of are:

  • Microarray data analysis
  • Comparing genomic sequences using dot plots
  • Computational evolutionary genomics
  • Protein Structure Prediction etc.


Please help me out with as diverse a set of projects as possible. It would be great if anyone could help me with a current/live project.


A very challenging project for you to consider...


Given the distances between the C-alpha atoms of a protein and the nearest solvent boundary, i.e. C-alpha-(nearest-surface-)H2O distances, could you accurately predict the protein structure?



That was a bit tough to follow. Can you please explain it further? I don't think I understood what you meant.

Let me tell you what I got from it: we want to predict the tertiary structure when the protein is in solvent (water) and the distances between each C-alpha and the nearest H2O are known. Does this mean that we have to gather the water-carbon interactions and the steric energy, and find the tertiary structure at the minimum energy level in solvent?

If so, can you please give an example? If not, please restate it in very basic terms so that it makes more sense to me.


OK, first let's get some experimental data.


We have a protein in solution, to which we add D2O.

Over time, deuterium (D) will replace hydrogen (H) atoms of the protein.

Then, at a certain time, we stop the exchange of D and H, i.e. quench the reaction.

Following this, we analyse the protein by peptide mass spectrometry.

This analysis will reveal the sites of H-D exchange.


Now, by looking at the kinetics of H-D exchange on the protein, i.e. performing a time course, it is possible to identify which protein residues are most solvent exposed.
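To make the time-course idea concrete, here is a rough Python sketch. It assumes each site follows a single exponential, D(t) = 1 - exp(-k*t), which is a simplification of real exchange kinetics, and it estimates k by a least-squares line through the origin on ln(1 - D). The residue names, time points and uptake values are all invented for illustration, and `fit_exchange_rate` is a hypothetical helper, not part of any library.

```python
import math

def fit_exchange_rate(times, fractions):
    """Estimate k in D(t) = 1 - exp(-k*t).

    Uses the linear form ln(1 - D) = -k*t and a least-squares
    line through the origin: k = -sum(t*ln(1-D)) / sum(t*t).
    """
    num = 0.0
    den = 0.0
    for t, d in zip(times, fractions):
        if not 0.0 <= d < 1.0:
            continue  # skip fully exchanged or out-of-range points
        num += t * math.log(1.0 - d)
        den += t * t
    if den == 0.0:
        raise ValueError("no usable time points")
    return -num / den

# Invented time course (seconds) and invented fractional deuterium uptake.
times = [10.0, 30.0, 100.0, 300.0]
uptake = {
    "res_12": [1 - math.exp(-0.05 * t) for t in times],   # fast exchanger
    "res_47": [1 - math.exp(-0.002 * t) for t in times],  # slow exchanger
}

rates = {name: fit_exchange_rate(times, d) for name, d in uptake.items()}
# Faster exchange is read here as "more solvent exposed".
ranked = sorted(rates, key=rates.get, reverse=True)
for name in ranked:
    print(name, round(rates[name], 4))
```

With real data you would have replicates and noise, so a proper nonlinear fit with error estimates would be a better choice than this log-linear shortcut.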


The question is: with this data, which I earlier simply called C-alpha-solvent distances, would it be possible to predict the protein structure?


Matt, I have gone through this exhaustive article (at the link). I have understood most of the paper, but I am unable to decide where to start and what the initial data will look like (data, values, data format, file format, etc.). Also, I have looked at the equations discussed, of which eq. (6) seems the most appropriate, as it gives the proton-deuterium exchange. This tells us which amino acids are near the boundary and which are inside the core of the protein.

But couldn't we also get the hydrophobic and hydrophilic amino acids from the sequence itself and proceed along those lines? And how will this value help us predict the structure of the protein? A higher value indicates an alpha-helix, but this is only for Cyt c, so how can it be generalised? (What threshold value should be taken?)
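For reference, the sequence-only route could look like the rough sketch below: a sliding-window average of the Kyte-Doolittle hydropathy scale. This is only a proxy for burial, not a measurement of solvent exposure, and the example sequence, the window size and the function name `hydropathy_profile` are made up for illustration.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def hydropathy_profile(sequence, window=5):
    """Return the mean hydropathy of every window along the sequence."""
    seq = sequence.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KD[aa] for aa in seq]  # KeyError on non-standard residues
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]

# Invented sequence: a hydrophobic stretch flanked by charged residues.
example = "KDEKRLLVIAVLLGKDER"
profile = hydropathy_profile(example, window=5)
for start, value in enumerate(profile):
    print(example[start:start + 5], round(value, 2))
```

Comparing such a profile with the measured exchange data would show where the sequence-based guess and the experiment disagree, which is where the experiment adds information.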

Please help me further with this. I am really looking forward to this project now.


Hi, I reread your initial post:

I have to develop a tool(program) to do some basic BI calculation/simulations /interpretations that can achieve a desired level of result.


As you wrote, knowing where to start is very important. Also important is realising what is achievable with your resources and limitations. I have no idea about the latter. But where to start? Break the problem into steps.

a) Obtain data (this may be invented or borrowed from a paper; remember that the data will come from the same experiment repeated many times).

b) Prepare data for analysis (this may involve converting residue H-D exchange times into probabilities that a residue is at the surface).

c) Analyse data (does the data make sense? I.e., have some expected distribution for what the data should look like. Perhaps this kind of analysis may suggest that the protein has multiple domains? Can you detect correlations in exchange times between adjacent residues?)

d) Build model (algorithms already exist for this, but they have limitations. Perhaps focus on a small part of the algorithm. Or do you want to try to incorporate other data, e.g. protein secondary structure predictions?)

e) Display model and statistical analysis.
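As a rough illustration of steps b) and c), here is a small Python sketch. It maps per-residue exchange rates to a 0..1 "surface score" with a logistic curve on log10(rate), then checks whether neighbouring residues have correlated rates. The logistic mapping, its `midpoint` and `steepness` parameters, and the rate values are all assumptions invented for this example, not taken from any paper.

```python
import math

def surface_probability(rate, midpoint=0.01, steepness=2.0):
    """Map an exchange rate to a 0..1 score.

    Logistic on log10(rate); midpoint and steepness are made-up
    parameters that would need calibrating on known structures.
    """
    x = steepness * (math.log10(rate) - math.log10(midpoint))
    return 1.0 / (1.0 + math.exp(-x))

def lag1_correlation(values):
    """Pearson correlation between each value and its sequence neighbour."""
    xs, ys = values[:-1], values[1:]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("correlation undefined for constant data")
    return cov / math.sqrt(vx * vy)

# Invented exchange rates (per second) for eight consecutive residues.
rates = [0.05, 0.04, 0.03, 0.002, 0.001, 0.0015, 0.02, 0.06]

probs = [surface_probability(k) for k in rates]
r = lag1_correlation([math.log10(k) for k in rates])

for i, (k, p) in enumerate(zip(rates, probs), start=1):
    print(i, k, round(p, 2))
print("lag-1 correlation of log10(rate):", round(r, 2))
```

A clearly positive correlation would suggest that exposure comes in runs along the chain, which is the kind of sanity check step c) asks for.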


In my opinion, tackling any one of steps a) to e) would make a good project; you could focus even further and tackle a small part of one step.


The main reason I suggested this kind of project is that I thought it would be educational. Think of the challenge! You are going from one-dimensional information to a 3D model. If you want, you could simplify it and go from 1D to a 2D protein. Admittedly 2D proteins don't exist, but you can invent them (or maybe they have already been invented). In my experience, thinking along these lines, 1D to 2D, is better for testing ideas and easier to program.
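One way to play with the 1D-to-2D idea is a chain on a square lattice. The Python sketch below enumerates every self-avoiding chain of a given length, counts the empty lattice neighbours of each residue as its "exposure", and keeps the folds whose exposure pattern best matches an observed one. The lattice model, the exposure definition, the squared-mismatch score and the observed pattern are all assumptions invented for this example.

```python
MOVES = [(1, 0), (-1, 0), (0, 1), (0, -1)]

def self_avoiding_walks(n):
    """Yield all n-residue chains on a square lattice that never revisit a site.

    The first bond is fixed along +x, which removes rotated duplicates.
    """
    if n < 2:
        raise ValueError("need at least two residues")

    def extend(path, occupied):
        if len(path) == n:
            yield list(path)
            return
        x, y = path[-1]
        for dx, dy in MOVES:
            nxt = (x + dx, y + dy)
            if nxt not in occupied:
                path.append(nxt)
                occupied.add(nxt)
                yield from extend(path, occupied)
                path.pop()
                occupied.remove(nxt)

    start = [(0, 0), (1, 0)]
    yield from extend(start, set(start))

def exposure(path):
    """Number of empty lattice neighbours of each residue (fewer = more buried)."""
    occupied = set(path)
    return [
        sum((x + dx, y + dy) not in occupied for dx, dy in MOVES)
        for x, y in path
    ]

def best_folds(observed):
    """Return (score, folds): the lowest squared mismatch and every fold reaching it."""
    best_score, best = None, []
    for path in self_avoiding_walks(len(observed)):
        score = sum((m - o) ** 2 for m, o in zip(exposure(path), observed))
        if best_score is None or score < best_score:
            best_score, best = score, [path]
        elif score == best_score:
            best.append(path)
    return best_score, best

# Invented per-residue exposure pattern for an 8-residue chain.
observed = [3, 2, 1, 1, 2, 3, 2, 2]
score, folds = best_folds(observed)
print("best mismatch:", score, "number of tied folds:", len(folds))
```

Even in this toy setting several different folds can fit the same exposure pattern equally well, which gives a feel for how much (or how little) exposure data alone constrains the structure.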


It is a project with plenty of room for imagination, more so as your use of statistics and probability advances.

