cazmantis Posted June 30, 2010 Hiya, I am currently BLASTing genomic data in FASTA format between species to try to identify a conserved locus. I have obtained quite a good list of BLAST results, but what I am specifically looking for are matches in the flanking regions of the gene, to make sure the hits come from the same locus. I have been looking into this for a week or so and my progress is... slow, to say the least! I can BLAST the gene from which the specific mRNA of interest is derived (I am using genomic DNA), and of course once I obtain a result I can see where the matches are in the nucleotide sequences. I am just really stuck at identifying where the flanking region is in the piece of genomic DNA I'm looking at! I am aware that it will be at the 5' end and will contain elements such as signal peptide and TATA box, but my lack of experience is really holding me back in figuring this out. Could anyone help me out with any pointers? Thanks so much! Caz
CharonY Posted July 1, 2010 I am not quite sure what you are trying to achieve, but if genomic sequences are available for your species of interest and you know the gene (and, by extension, the position of that gene in the genome), you can extract the respective upstream and downstream regions for each species and align them. I would go for multiple sequence alignments (e.g. clustalw) in order to see similarities and dissimilarities between the respective loci.
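For anyone following along: a multiple aligner such as clustalw takes all the regions in a single FASTA file. Below is a minimal sketch of assembling one, assuming you already have each species' region as a plain string. The function name `format_fasta`, the record names and the toy sequences are made up for illustration and are not from any particular library.

```python
def format_fasta(records, width=60):
    """Build multi-FASTA text from (name, sequence) pairs.

    `records` and `width` are hypothetical parameters for this sketch;
    sequences are wrapped at `width` characters per line.
    """
    lines = []
    for name, seq in records:
        lines.append(f">{name}")
        # Wrap the sequence so that each line holds at most `width` bases.
        for i in range(0, len(seq), width):
            lines.append(seq[i:i + width])
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Toy placeholder sequences, not real Apis or Nasonia data.
    regions = [("apis_locus", "ACGTAC"), ("nasonia_locus", "GGTT")]
    print(format_fasta(regions, width=4), end="")
```

The resulting text can be saved to a file and given to the aligner as its input.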
cazmantis Posted July 1, 2010 Hi Charon, thanks so much for your swift reply. I have just a question or two about what you said (many apologies - I'm new at this and getting myself into a real muddle!). Finding the position in the genome of the gene I am looking at is proving difficult. I am using mainly Apis and Nasonia sequences, so I have full genomes for them, but I'm not sure how one quantifies "the position in the genome". Are we just talking about which chromosome and locus the gene is located on, or should I try to be a little more precise? I think I have been working along the lines of what you suggest. I do have some confusion about which part of the gene constitutes the "flanking region", and also about how I extract the upstream and downstream regions in the NCBI database. I will continue my reading but any help would be greatly appreciated! Best, Caz
MedGen Posted July 2, 2010 cazmantis wrote: "Finding the position in the genome of the gene I am looking at is proving difficult. I am using mainly Apis and Nasonia sequences so I have full genomes for them but I'm not sure how one quantifies 'the position in the genome'. Are we just talking about which chromosome and locus the gene is located on or should I try and be a little more precise?" If you go to the UCSC Genome Browser, you can perform a BLAT search, which will align your query sequence against their reference builds and thus provide you with a genomic location for your sequences (provided those areas have been covered and mapped correctly). cazmantis wrote: "I do have some confusion about which part of the gene constitutes the 'flanking region' and also how I extract the upstream and downstream regions in NCBI database?" The best place to start is with your +1 position and find out where the 5'UTR begins. There are a number of programs that can be used to predict the positions of promoters and regulatory elements: http://www.gene-regulation.com/pub/programs.html http://alggen.lsi.upc.es/cgi-bin/promo_v3/promo/promoinit.cgi?dirDB=TF_8.3 http://bip.weizmann.ac.il/toolbox/seq_analysis/promoters.html To extract the upstream and downstream sequences from NCBI, ideally you need homologous sequences; alternatively, select the whole genome shotgun (WGS) and genomic sequence options when you run a BLAST query, and extract the entire contig, not just the homologous regions. The better way to do it is to use the aforementioned UCSC Genome Browser, which allows you to define the length of upstream sequence you want to extract.
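As a very rough illustration of what the promoter-prediction programs look for, here is a naive scan for a TATA-box-like motif in an upstream sequence. This is only a sketch under stated assumptions: the pattern TATA[AT]A[AT] is a commonly quoted consensus, the function name `find_tata_like` is hypothetical, and the input is assumed to end immediately before the +1 position. A bare pattern match like this will produce false positives and is no substitute for the dedicated tools linked above.

```python
import re

# Commonly quoted TATA-box consensus (TATAWAW); used here as an assumption.
# The lookahead lets overlapping matches be reported as well.
TATA_PATTERN = re.compile(r"(?=(TATA[AT]A[AT]))")


def find_tata_like(upstream):
    """Return (position, motif) pairs for TATA-like matches.

    `upstream` is assumed to be the sequence ending right before +1,
    so positions are negative offsets relative to +1 (-1 is the last base).
    """
    seq = upstream.upper()
    hits = []
    for match in TATA_PATTERN.finditer(seq):
        # Convert the 0-based index in the string to an offset from +1.
        position = match.start() - len(seq)
        hits.append((position, match.group(1)))
    return hits


if __name__ == "__main__":
    # Toy sequence, not real genomic data.
    print(find_tata_like("GGGTATAAAAGGGCC"))
```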
CharonY Posted July 2, 2010 IIRC the UCSC Genome Browser covered only a limited number of species (maybe they have updated it by now, I am not sure). One simple way to extract the sequence is to go into the NCBI genome browser for the respective organism, note the start and end positions of the gene, and export a few hundred bp upstream and downstream of them. I think they have updated quite a lot of stuff and there may now actually be an easier way to do so. I generally used to download the sequences and play around with them using custom tools. Of course there is also the possibility of downloading the sequences, opening them in one of the freeware genome browsers (e.g. artemis) and extracting your stuff from there. It really depends on what you want to do. Just a thought: if you want to compare the whole regions between the organisms, you may also want to look at, e.g., synteny.
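The "note the start and end position and export a few hundred bp" step can also be done on a downloaded sequence. Here is a minimal sketch, assuming the contig is held as a plain string and the gene coordinates are 1-based and inclusive on the forward strand. The function names and the default flank of 300 bp are illustrative choices, not anything prescribed by NCBI or UCSC. The one subtle point is that for a gene on the minus strand, the upstream (5') flank lies to the right on the forward strand and has to be reverse-complemented.

```python
# Translation table for complementing bases (assumes only A, C, G, T, N).
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def extract_flanks(contig, gene_start, gene_end, strand="+", flank=300):
    """Return (upstream, gene, downstream) in the gene's own orientation.

    gene_start and gene_end are assumed to be 1-based, inclusive,
    forward-strand coordinates. Flanks are truncated at the contig ends.
    """
    if not 1 <= gene_start <= gene_end <= len(contig):
        raise ValueError("gene coordinates fall outside the contig")
    # Convert to 0-based slices; clamp the left flank at the contig start.
    left = contig[max(0, gene_start - 1 - flank):gene_start - 1]
    gene = contig[gene_start - 1:gene_end]
    right = contig[gene_end:gene_end + flank]
    if strand == "+":
        return left, gene, right
    if strand == "-":
        # On the minus strand, the right-hand flank is the upstream one.
        return (reverse_complement(right),
                reverse_complement(gene),
                reverse_complement(left))
    raise ValueError("strand must be '+' or '-'")


if __name__ == "__main__":
    # Toy contig, not real genomic data.
    print(extract_flanks("AACCGTTTGA", 4, 6, strand="+", flank=2))
    print(extract_flanks("AACCGTTTGA", 4, 6, strand="-", flank=2))
```

Keeping the flank size as a parameter makes it easy to start with a few hundred bp and widen the window later if the alignments look promising.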
MedGen Posted July 3, 2010 True, UCSC has its limitations, but I think Apis mellifera is definitely available; I'm not sure about Nasonia, though.
cazmantis Posted July 3, 2010 Hi! Thanks for the replies, that's just great. Shortly after I posted I managed to figure out how to use the sequence viewer in NCBI a bit better and managed to export my gene plus the flanking regions for analysis in Clustal - but it's great to have the confirmation that I'm doing it right. I think I have made way too much allowance for my flanking regions though - as per your suggestion I will cut it down to a few hundred base pairs. I'll have a look at the software which predicts the placement of promoters too - if nothing else it will be interesting to see what results I obtain. Thanks so much! Until next time, Caz