Bioinformatics bites: Navigating BLAST results

This week’s Bioinformatics Bite picks up where last week’s post left off: the results page after you hit the BLAST button.

The results page has several sections that we will go through individually:

  1. Search details: What you entered into your query
  2. Graphic Summary: A visual representation comparing your query to the results
  3. Descriptions: Details about what the results are and how well they match the Query
  4. Alignments: The actual alignment of your query with each result (Subject)

Search details


The search details are essentially a recap of what parameters you entered into your search (name of your search, molecule type, query length, database name, the algorithm/program that you ran) but it also gives your search results a unique ID called an RID. This is a temporary link to get back to these search results just in case your computer crashes or you close the window, etc. You can get back to them by going to the BLAST main page and clicking on the gray “Recent Results” tab. It also creates a temporary query ID. If you prefer a video tutorial, also notice that there’s a link to a YouTube video about how to read the BLAST result page. Note that this section is also where you can edit and save your search.
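For readers who prefer scripting to clicking, here is a minimal sketch of building a link back to a set of results from its RID. It assumes the `CMD=Get`, `RID`, and `FORMAT_TYPE` parameters of NCBI's BLAST URL API; the RID value shown is a made-up placeholder, and the helper name `rid_results_url` is just for illustration.

```python
from urllib.parse import urlencode

# Base address of the NCBI BLAST service.
BLAST_URL = "https://blast.ncbi.nlm.nih.gov/Blast.cgi"


def rid_results_url(rid: str, format_type: str = "HTML") -> str:
    """Build a URL that asks BLAST for the results stored under an RID.

    Assumes the CMD=Get / RID / FORMAT_TYPE parameters of the BLAST URL API.
    """
    params = {"CMD": "Get", "RID": rid, "FORMAT_TYPE": format_type}
    return f"{BLAST_URL}?{urlencode(params)}"


# "EXAMPLE0RID" is a placeholder, not a real request ID.
print(rid_results_url("EXAMPLE0RID"))
```

Because RIDs are temporary, a link built this way only works while NCBI still holds the results.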

Graphic Summary


The Graphic Summary visualizes how the search results align with your query sequence. The colored boxes at the top are a key for the alignment score of each search result. Higher numbers (red) are better. The thick red bar under the color key represents the Query. The thinner lines below it show how the search results align with the query. The first (best) matches align along the whole length of the query, while the last few are missing some base pairs at the end.

If you click on one of the red results bars, the screen will jump to the actual alignment for that result and your query, but first let’s look at the Descriptions.

Descriptions


The Descriptions section contains a table with several informative columns:

  1. Description: how the result is annotated in the database
  2. Max Score: alignment score for the best matched segment
  3. Total Score: alignment score for the whole result
  4. Query Cover: how much of your query is included in the alignment
  5. E value: “expect value”, the number of matches expected by random chance
  6. Identity: %nucleotides that are identical between query and result
  7. Accession: the unique identifier for the result sequence (with link)

The max and total scores (2 & 3) refer to similarity scores, which are a measure of how well the query and subject match. Usually the score is calculated by adding points for all bases that match and subtracting points for mismatches and gaps. I will cover similarity scores in more depth in another post about algorithm parameters. As you can probably tell, if the subject and query match reasonably well, then the longer the sequence, the higher the score. This fact means that you can’t compare similarity scores between different BLAST queries. It is a way of ranking results within one search.
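To make the "add points for matches, subtract for mismatches and gaps" idea concrete, here is a minimal sketch. The point values (+1, -2, -2) are assumptions chosen for illustration, and the function `raw_score` is hypothetical. BLAST itself uses separate penalties for opening and extending a gap and reports normalized bit scores, so this is not how the numbers on the results page are produced.

```python
def raw_score(query_aln: str, subject_aln: str,
              match: int = 1, mismatch: int = -2, gap: int = -2) -> int:
    """Score two already-aligned sequences, where '-' marks a gap.

    The default point values are illustrative assumptions only.
    """
    if len(query_aln) != len(subject_aln):
        raise ValueError("aligned sequences must have the same length")
    score = 0
    for q, s in zip(query_aln, subject_aln):
        if q == "-" or s == "-":
            score += gap        # a base in one sequence has no partner
        elif q == s:
            score += match      # identical bases
        else:
            score += mismatch   # different bases
    return score


print(raw_score("ACGTACGT", "ACGTACGT"))          # perfect 8-base match
print(raw_score("ACGTACGT" * 4, "ACGTACGT" * 4))  # perfect 32-base match
print(raw_score("ACGTAC-T", "ACCTACGT"))          # one mismatch, one gap
```

The second call scores four times higher than the first even though both are perfect matches, which is why scores from different queries can't be compared directly.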

Query cover is the % of the query that is matched by the subject. In this case, all but the last 4 are 100% because they cover the whole query.

The E value, aka the “expect value”, is the number of matches you’d expect to get by random chance for a given database/query combination; in other words, the expected number of false positives. The lower the E value, the less likely it is that the match is due to chance. Because BLAST is meant to get at evolutionary relationships among sequences, another way of thinking about the E value is as a measure of how likely it is that you got this search result even though the subject and the query are not evolutionarily related.
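A rough sketch of why the E value depends on the database/query combination: in the standard BLAST statistics, the expected number of chance hits is approximately E = m × n × 2^(−S′), where m is the query length, n is the total database length, and S′ is the bit score. The function below is a simplification that ignores the length corrections BLAST applies, and the name `expect_value` and the numbers used are made up for illustration.

```python
def expect_value(bit_score: float, query_len: int, db_len: int) -> float:
    """Approximate E value: E = m * n * 2**(-S'), S' being the bit score.

    Simplified sketch; BLAST uses adjusted ("effective") lengths.
    """
    return query_len * db_len * 2.0 ** (-bit_score)


# Same alignment score, but the database is twice as large:
print(expect_value(bit_score=10, query_len=4, db_len=256))
print(expect_value(bit_score=10, query_len=4, db_len=512))

# Same database, but a better-scoring alignment:
print(expect_value(bit_score=20, query_len=4, db_len=256))
```

Doubling the database doubles the number of chance matches, while each extra bit of score halves it.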

The identity is the % nucleotides that are exact matches between the subject and the query.
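Query cover and identity are both simple percentages, and a small sketch may help show the difference between them. The helper names and example sequences are hypothetical, and the query cover calculation assumes a single aligned segment (BLAST combines all aligned segments for a hit).

```python
def percent_identity(query_aln: str, subject_aln: str) -> float:
    """Percent of alignment columns where query and subject are identical."""
    if len(query_aln) != len(subject_aln):
        raise ValueError("aligned sequences must have the same length")
    matches = sum(1 for q, s in zip(query_aln, subject_aln)
                  if q == s and q != "-")
    return 100.0 * matches / len(query_aln)


def query_cover(q_start: int, q_end: int, query_len: int) -> float:
    """Percent of the query inside one aligned segment (1-based, inclusive)."""
    return 100.0 * (q_end - q_start + 1) / query_len


print(percent_identity("ACGTAC-T", "ACCTACGT"))
print(query_cover(q_start=1, q_end=96, query_len=100))
```

A hit can have high identity with low query cover (a short but perfect match) or the reverse, so it is worth reading the two columns together.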

Finally, the Accession number is the unique identifier for the search result, with a link. This link maps to the entire sequence, not just the part that matches. For example, it would pull up the entire contig for a genomic hit or a whole transcript for an mRNA hit.

If you click on the description for a hit, it will move the page down to the alignment.

Alignments

The alignment lines up the query (top) and the subject (bottom) base by base along the whole length of the query. Vertical lines indicate exact matches. Horizontal lines (dashes) indicate gaps in the sequence. The numbers on the sides of the alignment refer to the base position of each sequence.
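The layout described above can be imitated in a few lines. This is only an illustrative sketch with a made-up function name and example sequences; it assumes both sequences read left to right and does not wrap long alignments the way the results page does.

```python
def format_alignment(query_aln: str, subject_aln: str,
                     q_start: int = 1, s_start: int = 1) -> str:
    """Draw query over subject with '|' under exact matches; '-' is a gap."""
    if len(query_aln) != len(subject_aln):
        raise ValueError("aligned sequences must have the same length")
    midline = "".join(
        "|" if q == s and q != "-" else " "
        for q, s in zip(query_aln, subject_aln)
    )
    # End positions count only real bases, not gap characters.
    q_end = q_start + len(query_aln.replace("-", "")) - 1
    s_end = s_start + len(subject_aln.replace("-", "")) - 1
    width = len(str(max(q_start, s_start)))
    return "\n".join([
        f"Query  {q_start:<{width}}  {query_aln}  {q_end}",
        f"       {'':<{width}}  {midline}",
        f"Sbjct  {s_start:<{width}}  {subject_aln}  {s_end}",
    ])


print(format_alignment("ACGTAC-T", "ACCTACGT"))
```

Note that the query ends at position 7 while the subject ends at position 8, because the gap in the query does not count as a base.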

One of the most useful features here is the related information section on the right. This will have links to other NCBI databases like Gene where you can find more information about a given search result.

Hope this was useful. We’re not done with BLAST yet though. Upcoming posts will discuss filtering your search results, adjusting algorithm parameters, saving BLAST searches, and creating custom search databases.

-Tobin Magle, Biomedical Sciences Research Support Specialist

NCBI is innovating: give them your feedback

NCBI is currently going through a period of innovation. To facilitate testing new features, they have created PubMed Labs. They are releasing new features in PubMed and the NCBI databases and soliciting feedback before they become permanent features. Two new features are currently available:

  1. PubMed Also-Viewed: Much like the “Customers Who Bought This Item Also Bought” feature on Amazon, this new feature uses anonymous user data to identify publications that are viewed together by the same user. You can find this under the Related Information section on the right of an article results page.
  2. SmartBLAST: This feature automatically sorts BLAST results based on sequence quality and creates a multiple sequence alignment of the 5 closest hits. This information is also displayed in a phylogenetic tree. You can find this feature at http://blast.ncbi.nlm.nih.gov/smartblast/.

You can provide feedback to NCBI by commenting on the blog posts for these features.

Keep an eye on the NCBI blog for more new features in the PubMed Labs category.

– Tobin Magle, Biomedical Sciences Research Support Specialist

Upcoming resource downtimes: ScienceDirect, Embase, ClinicalKey, ProQuest

1 August, 2015 (Saturday): Access to Elsevier’s ScienceDirect and Embase will be unavailable from 4:00 PM MST to 8:30 PM MST.

1 August, 2015 (Saturday): Access to ClinicalKey’s login and authentication services will be unavailable. ClinicalKey users will be unable to utilize any features that require login, such as the Presentation Maker, Saved Content, Search History, and Manage Credits services. However, all ClinicalKey users will still be able to access, search and browse content on ClinicalKey without logging in. Downtime is scheduled for 4:00 PM MST with an estimated end time of 9:30 PM MST.

8 August, 2015 (Saturday): ProQuest’s databases, including Dissertations and Theses, will be unavailable due to a system upgrade. The downtime is scheduled to run from 8:00 PM MST to 4:00 AM MST.

Librarians in the Lab: Gates Biomanufacturing Facility

This is Tobin Magle, Biomedical Sciences Research Support Specialist at the Health Sciences Library. I’m starting another blog series called “Librarians in the Lab” where I and other health sciences librarians visit labs on campus. This interaction will help us understand the type of work being done on campus, and give us some face time with the researchers that we serve so that they know better what we do. If you would like us to visit your lab, please contact tobin.magle@ucdenver.edu!

Cancer stem cells evade immune detection. (Courtesy of Brad Kubick, BS and Dennis Roop, PhD): Live imaging of cancer stem cells (green) evading immune cells (red) in a genetically engineered mouse model of skin cancer. In this model, skin cancers initiate around hair follicles (blue). This model may reveal how cancer stem cells avoid immune detection and suggest new therapeutic strategies to reverse this process.

Last week, research librarian Lilian Hoffecker and I visited the Gates Biomanufacturing Facility (GBF), which opened in April in the Biosciences Park Center on Montview. This facility allows the production of both cell therapies and biologics and is the only one of this caliber within an 800-mile radius. There are 25 other academic facilities in the United States that follow Good Manufacturing Practices (GMP), only 5 of which are comparable to GBF in quality and functionality, making GBF both geographically and functionally unique.

Having this great resource on campus would not have been possible without the contributions of the late Charles C. Gates, a local engineer, entrepreneur, and stem cell visionary. Health problems late in his life inspired him to fund translational medicine.

Generating human red blood cells (Courtesy of Greg Bird, PhD, Brian Turner, PhD and Yosef Refaeli, PhD): A major technological breakthrough now allows the expansion of human blood stem cells in the laboratory. The expanded human blood stem cells can be differentiated into erythroid progenitor cells (large pink cells with a nucleus (purple)) which give rise to mature red blood cells (small red cells). This technology makes it feasible to generate an unlimited supply of pathogen free human blood.

A generous donation from his foundation allowed the creation of the Charles C. Gates Center for Regenerative Medicine. This center runs 3 core laboratories, including the GMP facility. In Gates’s entrepreneurial spirit, the center focuses on getting discoveries made on campus into hospitals and clinics, which requires the services that GBF provides. This facility meets FDA safety regulations for human use. The air inside the processing labs contains 1000x fewer particulates than regular air to make sure the products are safe to use in humans. Additionally, they have implemented robust quality management systems and standard operating procedures to assure the highest quality.

The Gates Biomanufacturing Facility addresses two very hot topics in biomedical research: translational research and personalized medicine. The research coming into GBF has already been tested in animal models, but needs to clear strict quality hurdles to be tested in humans. The ultraclean environment and strict reporting processes at GBF allow treatments that were successful in animal models to be translated to the clinic, and allow production to be scaled up to support phase 1 clinical trials of both cell and protein products (biologics).

The developing mouse eye (Courtesy of Tatiana Eliseeva, BS and Joe Brzezinski, PhD): Retinal stem cells (green) produce photoreceptors (purple). Photoreceptors die in diseases like age-related macular degeneration. Photoreceptors derived from patient-specific induced Pluripotent Stem (iPS) cells could be used to treat macular degeneration.

The truly amazing feature of GBF’s services is their application to personalized medicine, made possible by its association with the Gates Center for Regenerative Medicine. Stem cell therapies have been controversial in the past because of ethical concerns about the use of embryonic stem cells (ESCs). Recent advances in stem cell research have made it possible to reprogram adult skin cells to create ESC-like cells. This strategy is advantageous for two reasons: it removes the ethical constraints around using ESCs and also reduces the risk of the patient’s body rejecting the cells. The GBF allows researchers to make clean cells from patient biopsy samples that can be reintroduced as a treatment. This technology can be applied to conditions as diverse as macular degeneration and epidermolytic hyperkeratosis, and to uses such as repairing damaged heart tissue, cancer immunotherapy, and bone and cartilage regeneration, not to mention producing human blood in the lab.

The services available at GBF are available to campus researchers at cost, which is a major advantage to anyone doing translational research on this campus. This activity will be subsidized by for-profit work done in collaboration with biotech startup companies. This arrangement is mutually beneficial because it promotes campus research and is significantly more affordable than investigators building their own facility or outsourcing the work. Finally, having this facility on campus is important for recruiting top academic faculty who are interested in translational science and personalized medicine.

Generating skin stem cells (Courtesy of Anya Bilousova, PhD and Dennis Roop, PhD): Human induced Pluripotent Stem (iPS) cells (pink) can be differentiated into ectodermal cells (green) which subsequently differentiate into skin stem cells. Nuclei are stained blue. This approach is being used to develop novel therapeutic strategies for inherited skin blistering diseases where patient-specific iPS cells are generated, genetically corrected and differentiated into normal skin stem cells which are then returned to the same patient as an autograft.

We received a tour of the facility and were able to see the cell products and biologics development areas and the quality control facilities. One of the most visually striking aspects of the facility is the wall art depicting some of the cell therapies that will be in development soon at GBF; those images are dispersed throughout this post. We’re looking forward to hearing about all of the great discoveries that come out of the GBF. In the meantime, enjoy the images.

Generating dopamine secreting neurons (Courtesy of Wenbo Zhou, PhD and Curt Freed, MD): Human induced Pluripotent Stem (iPS) cells can be differentiated into human neuronal cells (red nuclei), some of which are dopamine secreting neurons (green cells with yellow nuclei). Following implantation into a rat model of Parkinson’s disease these human cells survive long term. The green fibers are connections of the human dopamine neurons to the rat brain cells. This approach may eventually be used to treat patients with Parkinson’s disease.

Gates Biomanufacturing Facility

12635 E Montview Blvd., Suite 380

Aurora, CO 80045

Contacts:

Thomas Payne, PhD, Director of Cell Therapies – thomas.payne@ucdenver.edu – 303.724.7779

Patrick Gaines, Business Development – patrick.gaines@ucdenver.edu – 720.281.2100

Timothy Gardner, Business Development – timothy.gardner@ucdenver.edu – 303.724.7049

——————————————————————————————————–

Regenerating Bone (Courtesy of Karin Payne, PhD): Bone allografts (dark circles) can be revitalized (green staining) using mesenchymal stem cells derived from human induced Pluripotent Stem (iPS) cells. Revitalized bone allografts can be used to enhance bone fracture repair and improve spine fusion.

Detecting hair follicle stem cells (Courtesy of Stanca Birlea, MD and David Norris, MD): A cross-section through a human hair follicle revealing the location of multipotent stem cells (green) in a region called “the bulge” which is located just above the site of insertion of the arrector pili muscle (red). Multipotent stem cells renew hair follicles, sebaceous glands and the epidermis in response to injury. The bulge is also the location of melanocyte stem cells which can be mobilized to repigment the skin of patients who suffer from vitiligo.

Generating cardiomyocytes (heart muscle cells) (Courtesy of Kunhua Song, PhD): Fibroblasts (skin cells) were directly reprogrammed into cardiomyocytes which contain sarcomeres (red) and nuclei (blue). This approach may reveal novel therapeutic strategies for heart repair.

Preventing radiation-induced oral mucositis (Courtesy of Xiao-Jing Wang, MD, PhD): Topical delivery of the fusion protein, tat-Smad7 (green), to oral mucosal cells (red) in a mouse results in its efficient uptake into nuclei and prevention of radiation-induced oral mucositis (extensive oral ulcers). This approach represents a novel therapeutic strategy to treat and prevent oral mucositis which develops in 40-70% of cancer patients receiving chemo- or radiation- therapy.

Imaging neural stem cells in zebrafish (Courtesy of Bruce Appel, PhD): Neural stem cells (green) produce myelinating glia (red) in the brain of a living zebrafish. This model is being used to screen for new drugs that may stimulate neural stem cells to proliferate and produce new glia in degenerative diseases of the nervous system.

Generating humanized cancer models, XactMice (Courtesy of Jason Morton, PhD, Greg Bird, PhD, Yosef Refaeli, PhD and Antonio Jimeno, MD, PhD): Dr. Refaeli’s major technological breakthrough, which allows the expansion of human blood stem cells, now makes it feasible to generate mice with tumor tissue and blood stem cells from the same patient. This is an image of a human tumor which was excised from a male patient and transplanted onto a mouse whose bone marrow was reconstituted with human blood stem cells from a female. Staining for X chromosomes (red) and Y chromosomes (green) confirms that the male tumor cells (red and green dots) are infiltrated with female cells (red dots only) which were derived from the human blood stem cells. This model will serve as a new platform to discover new drugs which are directed against the tumor stroma and reverse the tumor’s ability to evade immune detection.

Migrating neural crest cells in zebrafish (Courtesy of Kristin Artinger, PhD): An elongated neural crest cell (green) migrating past the notochord (upper dark circle) and somite (red). Nuclei are stained blue. Neural crest cells are multipotent stem cells, giving rise to diverse cell lineages including peripheral neurons, glia, pigment cells (melanocytes) and craniofacial cartilage which forms the face. Understanding how neural crest cells differentiate into these different cell lineages may provide insight into the repair and treatment of birth defects such as cleft-lip and other craniofacial syndromes, as well as migration of cancer cells in melanoma.

Bioinformatics bites: Constructing a BLAST query

This week’s Bioinformatics Bite will go through the basics of how to construct a query for NCBI’s BLAST services.

A previous post discusses how to use text searching to find information about genes. BLAST is another way to search the NCBI databases.

BLAST stands for Basic Local Alignment Search Tool. This tool takes a nucleotide or protein sequence and searches a database of your choice for sequences that are similar to the one you entered, which suggests that they have homology, or a shared ancestry. We’re not going to go into the nuts and bolts of how the algorithm works. Instead we will focus on what the user needs to know to use this tool.

You need the following pieces of information before you begin your BLAST search.

  1. Query – What are you putting into BLAST?
  2. Subject (aka search result) – What do you want to retrieve from the NCBI databases?
  3. Algorithm – depends on the combination of Query and Subject from above
  4. Database – Can you limit what you search based on what you’re looking for?
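As a preview of the Algorithm item, the way the choice follows from the Query and Subject types can be sketched as a simple lookup. The program names are the standard BLAST programs; the function and type labels are hypothetical names made up for this example.

```python
# Maps (query type, database type) to the standard BLAST program.
PROGRAMS = {
    ("nucleotide", "nucleotide"): "blastn",
    ("protein", "protein"): "blastp",
    ("nucleotide", "protein"): "blastx",    # query is translated
    ("protein", "nucleotide"): "tblastn",   # database is translated
}


def choose_program(query_type: str, database_type: str,
                   translate_both: bool = False) -> str:
    """Pick a BLAST program from the query and database molecule types."""
    if translate_both:
        # tblastx compares translated nucleotide query vs translated database.
        if (query_type, database_type) != ("nucleotide", "nucleotide"):
            raise ValueError("tblastx needs a nucleotide query and database")
        return "tblastx"
    return PROGRAMS[(query_type, database_type)]


print(choose_program("nucleotide", "protein"))
```

For example, a DNA query searched against a protein database calls for blastx, which translates the query before comparing it.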

Let’s look at these aspects one at a time.


Want to publish in Nature?

Speaking at the 2015 NatureJobs Career Expo, the editor from Nature Genetics gives advice for writing titles and abstracts that are worthy of being published in Nature’s high-impact journals.

My favorite tips from the article (paraphrased):

Title:

  • Keep it concise and meaningful, and focus on novel aspects.
  • Be specific, but not too specific (avoid jargon)
  • Don’t tease the reader: make it a statement and not a question
  • And as much as I love them: no puns, because they “are not usually very helpful, lead to fewer citations, and tend to make papers invisible to web searches”.

Abstract:

  • Include keywords to make it more searchable.
  • Focus on findings, not methods.
  • The abstract should stand alone: don’t reference external material.

Notice how he mentions searchability multiple times? Publishers want to make sure their content is findable, and librarians can help you identify appropriate keywords for your research. Have questions? Ask Us.

-Tobin Magle, PhD. Biomedical Sciences Research Support Specialist