Sequence and Analysis
From RadishDB
Revision as of 12:48, 23 March 2009
Sequences
- Sequences:1st Batch:First 5,000 clones, EST contigs
- Sequences:All: 25,000 clones from each library, EST contigs
Linkage map
Preliminary linkage map for radish (Raphanus raphanistrum).
Markers
SSR
- New York population of Raphanus raphanistrum, tested and variable
- New York population of Raphanus raphanistrum, tested with little variation
- Spreadsheet of primers tested: updated 23 March 2009 (EDT)
SNP
Analysis Notes
Novel coding sequence
- Several tiling array studies have found that unannotated regions of eukaryotic genomes are expressed. In addition, small protein-coding genes are generally harder to predict, and many have been identified experimentally rather than by computational prediction.
- As a proof of concept, we have identified >900 small open reading frames in the intergenic regions of the Arabidopsis thaliana genome that are highly likely to be real genes. The work was supported mainly by the Radish transcriptome sequencing grant and has been published in Genome Research.
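To illustrate what a "small open reading frame" is, the following minimal Python sketch scans the forward strand of a DNA sequence for short ATG-to-stop ORFs. It is not the published method, which is not described on this page. The function name and the length bounds (`min_codons`, `max_codons`) are hypothetical, chosen only for the example.

```python
# Illustrative only: a naive forward-strand scan for short ORFs.
# The length bounds below are hypothetical, not values from the study.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_small_orfs(seq, min_codons=10, max_codons=100):
    """Return (start, end, frame) tuples for ATG..stop ORFs on the forward strand.

    start is the 0-based index of the A in ATG; end is the index just past
    the stop codon. Length is counted in codons, excluding the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame start codon opens an ORF
            elif start is not None and codon in STOP_CODONS:
                n_codons = (i - start) // 3
                if min_codons <= n_codons <= max_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # look for the next ORF in this frame
    return orfs
```

A real analysis would also scan the reverse strand and would need further evidence (for example expression or conservation) before calling an ORF a likely gene.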
EST assembly
The contigs available for download were generated by the JCVI plant TA pipeline. Assembly was carried out for each library as follows:
- Use seqclean to clean all EST sequences: discard sequences shorter than 100 bp, search the internal UniVec database to screen out vector sequences, and use DUST to remove low-complexity sequences.
- Use the TGICL pipeline to generate contigs with the following parameters:
  - Minimum percent identity for overlaps (PID): 95
  - Minimum overlap length: 50
  - Maximum length of unmatched overhangs: 20
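
The thresholds above can be read as simple acceptance rules. The Python sketch below shows one way to express them. It does not call seqclean or TGICL and is not part of the pipeline. The `Overlap` class and function names are hypothetical, and treating each threshold as inclusive is an assumption.

```python
from dataclasses import dataclass

# Thresholds taken from the text above.
MIN_EST_LENGTH = 100          # bp; shorter ESTs are discarded during cleaning
MIN_PERCENT_IDENTITY = 95.0   # PID
MIN_OVERLAP_LENGTH = 50       # bp
MAX_OVERHANG = 20             # bp of unmatched overhang


@dataclass
class Overlap:
    """Hypothetical summary of a pairwise alignment between two ESTs."""
    percent_identity: float
    overlap_length: int
    overhang_a: int  # unmatched overhang on the first sequence
    overhang_b: int  # unmatched overhang on the second sequence


def passes_length_filter(seq: str) -> bool:
    # Keep sequences of at least 100 bp (inclusive bound is an assumption).
    return len(seq) >= MIN_EST_LENGTH


def overlap_is_acceptable(ov: Overlap) -> bool:
    # An overlap must satisfy all three criteria to be used for assembly.
    return (
        ov.percent_identity >= MIN_PERCENT_IDENTITY
        and ov.overlap_length >= MIN_OVERLAP_LENGTH
        and max(ov.overhang_a, ov.overhang_b) <= MAX_OVERHANG
    )
```

For example, `overlap_is_acceptable(Overlap(96.0, 60, 5, 10))` is accepted, while an overlap with 94% identity or a 25 bp overhang is rejected.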
