How useful are predicted sequences in NCBI?


cazmantis
Bit new to bioinformatics and just wondered how useful PREDICTED amino acid sequences derived from the genome are (I have found several in NCBI). If trying to BLAST these sequences to find conserved sequences, are the results going to be of use or is it better to not bother with these sequences at all?

 

Thanks,

 

Caz


Well, of course there are problems knowing whether you've got the right open reading frame, as well as knowing whether you're actually in a gene or not. But other than that, I have never had any problem (I only have limited experience, though).
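To make the reading-frame issue concrete, here is a minimal sketch of a naive ORF scan. It is only an illustration: the function name `longest_orf` is made up for this example, it looks at the forward strand only, and it knows nothing about introns or the reverse strand, so it is not how real gene-prediction pipelines work.

```python
# Naive forward-strand ORF scan (illustrative only; hypothetical helper).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(seq):
    """Return (frame, start, end) of the longest ATG...stop ORF.

    Coordinates are 0-based, end-exclusive. Returns None if no
    complete ORF is found in any of the three forward frames.
    """
    seq = seq.upper()
    best = None
    for frame in range(3):
        start = None
        # Walk the sequence codon by codon in this frame.
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon opens an ORF
            elif start is not None and codon in STOP_CODONS:
                end = i + 3  # include the stop codon
                if best is None or (end - start) > (best[2] - best[1]):
                    best = (frame, start, end)
                start = None  # look for the next ORF in this frame
    return best


example = "CCATGAAATTTGGGTAACC"  # toy sequence, not real data
hit = longest_orf(example)
if hit is not None:
    frame, start, end = hit
    print(frame, example[start:end])
```

The same stretch of DNA read in a different frame gives a completely different (or no) ORF, which is why picking the right frame matters.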


It depends on what you are looking for and what kind of database you use.

Sequences for already well-characterized proteins tend to be useful in most cases. However, because of the automated pipelines used nowadays, errors may still be present. Swiss-Prot, for instance, is a better-curated database, though it contains fewer sequences overall.

As a rule of thumb, reality-checking against well-characterized reference genomes is helpful, especially with regard to functional assignments.

But again, it really depends on what you are looking for (e.g. single protein vs. whole-genome analyses, intergenic regions, etc.).


Hiya!

 

Wow thanks so much for the replies - very useful stuff so far and has served to illuminate my own lack of knowledge of the subject! I think I am getting a little confused with these things. The PREDICTED sequences I am looking at are nucleotide sequences and I am not sure how I would cross reference that against a genome. It just doesn't seem to make sense to me so I assume that my lack of experience in the field means I don't have access to all the facts! For example NCBI accession number XM_001120951 - it says this nucleotide sequence has been predicted from the genomic sequence. I am finding this quite confusing as surely the nucleotide exists or it doesn't. I want to understand how these predicted sequences are different to (let's say) "normal" sequences.

 

I understand this may be a little in-depth to expect an answer on, but if anyone could perhaps recommend a book which covers this aspect of genomes, that would be just as helpful for me.

 

Thanks so much for your help,

 

Caroline


Predicted does not mean that the sequence itself is predicted (the underlying genome has been sequenced), but that the region has been predicted to be an open reading frame. This topic should be covered by most molecular genetics textbooks (e.g. Genes).

Again, the function of a locus is predicted, but the sequence itself is based on data (though depending on the source it may be faulty, but that is another issue).
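As a rough illustration of how to tell the two kinds of record apart: in NCBI RefSeq the accession prefix signals this, with `XM_`, `XR_` and `XP_` used for model records produced by the annotation pipeline and `NM_`, `NR_` and `NP_` for their non-model counterparts. The small lookup below is only a sketch; the helper name `describe_accession` and the labels "known" and "model" are my own shorthand, and it covers only these six prefixes.

```python
# Sketch: classify a RefSeq accession by its prefix.
# The labels are informal shorthand, not official NCBI status values.
REFSEQ_PREFIXES = {
    "NM_": ("mRNA", "known"),
    "NR_": ("non-coding RNA", "known"),
    "NP_": ("protein", "known"),
    "XM_": ("mRNA", "model"),
    "XR_": ("non-coding RNA", "model"),
    "XP_": ("protein", "model"),
}


def describe_accession(accession):
    """Return (molecule, kind) for a RefSeq accession, or None.

    None means the prefix is not one of the six handled here,
    e.g. a GenBank accession or another RefSeq prefix.
    """
    prefix = accession.strip()[:3].upper()
    return REFSEQ_PREFIXES.get(prefix)


# The accession mentioned earlier in this thread:
print(describe_accession("XM_001120951"))
```

So a "model" record such as an `XM_` entry is a transcript inferred from sequenced genomic DNA, not a separately sequenced mRNA.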


  • 11 years later...

Good day CharonY.

Following on from what cazmantis asked you, I would also like to know whether it's appropriate for me to design primers using 'predicted' nucleotide sequences obtained from NCBI.

(I am carrying out a project that involves primer design.)

Please reply soon CharonY.

Thanks
