What is a microbiome? Asking for a friend.

If you find that the word ‘microbiome’ has crept into your lexicon but you don’t really know what it means or how to use it – fear not, you’re not alone. Microbiome is a new-ish term to describe something that has been studied for almost a century: the collection of microorganisms in a dynamic ecosystem, including who they are and what they are doing.

Picture a crowd of humans. Maybe this one:

Image Source: Wikimedia Commons, 2017-09-09 Oregon Ducks vs. Nebraska Cornhuskers

The picture is just one instant in an event involving hundreds or thousands of organisms that were all doing a lot of different things, sometimes for just a few seconds. How would you describe it?

Maybe using the number of members present in this community? Or a list of names of attendees? The 16S rRNA gene for prokaryotes, or the 18S rRNA or ITS genes for eukaryotes, for example, would tell us that. Those genes are found in all types of those organisms, and are a pretty effective means of basic identification. But, it’s only as good as how often that gene is found in the organisms you are looking for. There is no one gene that’s found exactly the same in all organisms, so you might need to target multiple different identification genes to look at all the different types of microorganisms, such as bacteria, fungi, protozoa, or archaea. Viruses don’t share a common gene across types, so to look at viruses you’d need something else.

From our identification genes we could identify all the organisms wearing yellow; ex. phylogenetic Family = Ducks. That wouldn’t tell us if they were always found in this ecosystem (native Eugene population) or just passing through (transient population), but we could figure that out if we looked at every home game of the season and found certain community members there time and again.

But knowing they are Ducks doesn’t tell us anything else about that community member. What will they do if it starts raining? Are they able to go mountain biking? Perhaps we could identify their potential for activity by looking at the objects they are carrying? That would be akin to metagenomics, identifying all the DNA present from all the organisms, which tells us what genes are present, but not if they are currently or ever used. It can be challenging to interpret: think of sequencing data from one organism’s genome as one 1,000,000-piece puzzle and all the genomes in a community as 1,000 1,000,000-piece puzzles all dumped in a pile. In the crowd, metagenomics would tell us who had a credit card that was specifically used to buy umbrellas, but not whether they’d actually use the umbrella if it rains (ex. Eugeneans would not).

We could describe what everyone is doing at this moment. That would be transcriptomics, identifying all the RNA to determine which genes were actively being transcribed so they could be made into proteins for use in some cellular function. If we see someone in the crowd using that credit card for an umbrella (DNA), the receipt would be the RNA. RNA is a working copy you make of the DNA to take to another part of the cell and use as a blueprint to make a protein. You don’t want your entire genome moving around, or need it to make one protein, so you make a small piece of RNA that will only hang around for a short period before degrading (i.e. you crumpling that RNA receipt and throwing it away because who keeps receipts anymore).

Using transcriptomics, we’d see you were activating your money to get that umbrella, but we wouldn’t see the umbrella itself. For that, we’d need metabolomics, which uses chemistry and physics instead of genomics, in order to identify chemicals (most often proteins). Think of metabolomics as describing this crowd by all the trash and crumbs and miscellaneous items they left behind. It’s one way to know what biological processes occurred (popcorn consumption and digestion).

Image Source: Wikimedia Commons, Metabolomics

From a technical standpoint, researching a microbiome might mean looking at all the DNA from all the organisms present to know who they are and of what they are capable. It might also mean looking at all the RNA present, which would tell you what genes were being used by “everyone” for whatever they were doing at a particular moment. Or you might also add metabolomics to identify all the chemical metabolites, which would be all the end products of what those cells were doing, and which are more stable than RNA so they could give you data about a longer frame of time. Collectively, -omics are technology that looks at all of a certain biological substance to help you understand a dynamic community. However, it’s important to remember that each technology gives a particular view of the community and comes with its own limitations.

A visit from Bozeman

Last year, one of my former research groups at Montana State University was awarded a USDA NIFA Foundational program grant, and I am a sub-award PI on that grant.  We’ll be working together to investigate the effect of diversified farming systems – such as those that use cover crops, rotations, or integrate livestock grazing into field management – on crop production and soil bacterial communities: “Diversifying cropping systems through cover crops and targeted grazing: impacts on plant-microbe-insect interactions, yield and economic returns.”

The first soil samples were collected in Montana this summer, and I have been processing them for the past few weeks. I am using the opportunity to train a master’s student on microbiology and molecular genetics lab work. 

Tindall Ouverson started this fall as a master’s student at MSU, working with Fabian Menalled and Tim Seipel in Bozeman, MT.  She’s an environmental and soil scientist, and this is her first time working with microbes.  She was here in Eugene for just a few days to learn everything needed for sequencing: DNA extraction, polymerase chain reaction, gel electrophoresis and visualization, DNA cleanup using magnetic beads, quantification, and pooling.  Despite not having experience in microbiology or molecular biology, Tindall showed a real aptitude and picked up the techniques faster than I expected!

Once the sequences are generated, I’ll be (remotely) training Tindall on DNA sequence analysis.  I’ll also be serving as one of her thesis committee members! Tindall will be the first of (hopefully) many cross-trained graduate students between myself and collaborators at MSU.

(Reblog) A Tale of Two Cohorts

Original posting on BioBE.

Sequence data contamination from biological or digital sources can obscure true results and falsely raise one’s hopes.  Contamination is a persistent issue in microbial ecology, and each experiment faces unique challenges from a myriad of sources, which I have previously discussed.  In microbiology, those microscopic stowaways and spurious sequencing errors can be difficult to identify as non-sample contaminants, and collectively they can create large-scale changes to what you think a microbial community looks like.

Samples from large studies are often processed in batches based on how many samples can be processed by certain laboratory equipment, and if these span multiple bottles of reagents, or water-filtration systems, each batch might end up with a unique contamination profile.  If your samples are not randomized between batches, and each batch ends up representing a specific time point or a treatment from your experiment, these batch effects can be mistaken for a treatment effect (a.k.a. a false positive).
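One way to picture the randomization described above is the short Python sketch below. It is only an illustration: the sample names, time points, and batch size are made up, and it is not the workflow used in this study. It shuffles all samples first and only then cuts them into processing batches, so no batch is tied to a single time point by design.

```python
import random
from collections import Counter


def assign_batches(samples, batch_size, seed=0):
    """Shuffle the samples, then cut the shuffled list into batches."""
    rng = random.Random(seed)  # fixed seed so the assignment can be reproduced
    shuffled = list(samples)
    rng.shuffle(shuffled)
    return [shuffled[i:i + batch_size] for i in range(0, len(shuffled), batch_size)]


# Hypothetical samples: (sample name, time point)
samples = [(f"home{i:02d}", "before" if i % 2 == 0 else "after") for i in range(24)]
batches = assign_batches(samples, batch_size=8)

# Count how many samples from each time point landed in each batch
for number, batch in enumerate(batches, start=1):
    print(number, Counter(timepoint for _, timepoint in batch))
```

Printing the per-batch counts is a quick check that a time point or treatment has not ended up concentrated in one batch by chance; if it has, reshuffling with a different seed is cheap.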

Due to the high cost of sequencing, and the technical and analytical artistry required for contamination identification and removal, batch effects have long plagued molecular biology and genetics.  Only recently have the pathologies of batch effects been revealed in a harsher light, thanks to more sophisticated analysis techniques (examples here and here and here) and projects dedicated to tracking contamination through a laboratory pipeline.  To further complicate the issue, the sources of and practical responses to contamination in fungal data sets are quite different from those in bacterial data sets.

Chapter 1

“The times were statistically greater than prior time periods, while simultaneously being statistically lesser to prior times, according to longitudinal analysis.”

Over the past year, I analyzed a particularly complex bacterial 16S rRNA gene sequence data set, comprising nearly 600 home dust samples, and about 90 controls.  Samples were collected from three climate regions in Oregon, over a span of one year, in which homes were sampled before and approximately six weeks after a home-specific weatherization improvement (treatment homes) or simply six weeks later in (comparison) homes which were eligible for weatherization but did not receive it.  As these samples were collected over a span of a year, they were extracted with two different sequencing kits and multiple DNA extraction batches, although all within a short time after collection. The extracted DNA was spread across two sequence runs to allow for data processing to begin on cohort 1, while we waited for cohort 2 homes to be weatherized.  Thus, there were a lot of opportunities to introduce technical error or biological contamination that could be conflated with treatment effects.

On top of this, each home was unique, with its own human and animal occupants, architectural and interior design, plants, compost, and quirks, and we didn’t ask homeowners to modify their behavior in any way.  This was important, as it meant each of the homes – and its microbiome – is somewhat unique.  Therefore I didn’t want to remove sequences which might be contaminants on the basis of low abundance and risk removing microbial community members which were specific to that home.  After the typical quality assurance steps to curate and process the data, which can be found on GitHub as an R script of a DADA2 package workflow, I needed to decide what to do with the negative controls.

Because sequencing is expensive, most of the time there is only one negative control included in sequencing library preparation, if that.  The negative control is a blank sample – just water, or an unused swab –  which does not intentionally contain cells or nucleic acids. Thus anything you find there will have come from contamination. The negative control can be used to normalize the relative abundance numbers – if you find 1,000 sequences in the negative control, which is supposed to have no DNA in it, then you might only continue looking at samples with a certain amount higher than 1,000 sequences. This risks throwing out valid sequences that happen to be rare. Alternatively, you can try to identify the contaminants and remove whole taxa from your data set, risking the complete removal of valid taxa.

I had three types of negative controls: sterile DNA swabs which were processed to check for biological contamination in collection materials, kit controls where a blank extraction was run for each batch of extractions to test for biological contamination in extraction reagents, and PCR negative controls to check for DNA contamination of PCR reagents. In total, 90 control samples were sequenced, giving me unprecedented resolution to deal with contamination. Looking at the total number of sequences before and after my quality-analysis processing, I can see that the number of sequences in my negative controls reduces dramatically; they were low-quality in some way and might be sequencing artifacts. But, an unsatisfactory number remain after QA filtering; these are high-quality and likely come from microbial contamination.

I wasn’t sure how I wanted to deal with each type of control. I came up with three approaches, and then looked at unweighted, non-rarefied ordination plots (PCoA) to watch how my axes changed based on important components (factors).  What follows is a narrative summary of what I did, but I included the R script of my phyloseq package workflow and workaround on GitHub.

Chapter 2

“In microbial ecology, preprints are posted on late November nights. The foreboding atmosphere of conflated factors makes everyone uneasy.”

Ordination plots visualize lots of complex communities together. In both ordination figures below, each point on the graph represents a dust sample from one house. They are clustered by community distance: those closer together on the plot have a more similar community than points which are further away from each other.  The points are shaped by the location of the samples, including Bend, Eugene, Portland, along with a few pilot samples labeled “Out”, and negative controls which have no location (not pictured but listed as NA).  The points are colored by DNA extraction batch.

Figure 1 Ordination of home samples prior to removing contaminants found in negative controls.  

In Figure 1, the primary axis (axis 1) shows a clear clustering of samples by DNA extraction batch, but this is also mixed with geographic location, and as it turns out – date of collection and sequencing run.  We know from other studies that geographic location, date of collection, and sequencing batch can all affect the microbial community.

Approach 1: Subtraction + outright removal

This approach subsets my data into DNA extraction batches, and then uses the number of sequences found in the negative controls to subtract out sequences from my dust samples.  This assumes that if a particular sequence showed up 10 times in my negative control, but 50 times in my dust samples, that only 40 of those in my dust sample were real. For each of my DNA extraction batch negative control samples, I obtained the sum of each potential contaminant that I found there, and then subtracted those sums from the same sequence columns in my dust samples.
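The subtraction idea can be sketched in a few lines of Python. This is not the phyloseq workaround from the GitHub script; the sequence names and counts are invented, and flooring negative results at zero is an assumption made for the sketch.

```python
from collections import Counter


def sum_controls(controls):
    """Add up the counts of each sequence across one batch's negative controls."""
    totals = Counter()
    for counts in controls:
        totals.update(counts)
    return totals


def subtract_controls(samples, control_totals):
    """Subtract the control totals from every sample in the batch.

    Counts that would go negative are floored at zero (an assumption).
    """
    return {
        name: {seq: max(n - control_totals.get(seq, 0), 0) for seq, n in counts.items()}
        for name, counts in samples.items()
    }


# Hypothetical extraction batch: two negative controls and two dust samples
batch_1_controls = [{"seqA": 6, "seqC": 1}, {"seqA": 4}]
batch_1_dust = {
    "home01": {"seqA": 50, "seqB": 20, "seqC": 1},
    "home02": {"seqA": 3, "seqB": 7},
}

totals = sum_controls(batch_1_controls)
cleaned = subtract_controls(batch_1_dust, totals)
print(cleaned["home01"])  # {'seqA': 40, 'seqB': 20, 'seqC': 0}
```

Here seqA appears 10 times across the controls, so the 50 copies in home01 drop to 40, while home02, which had fewer copies than the controls, drops to zero.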

Figure 2 Ordination of home samples after removing contaminants found in negative controls, particular to each batch, using approach 1.  

Approach 1 was alright, but there was still an effect of DNA extraction batch (indicated by color scale) that was stronger than location or treatment (not included on this graph). This approach is also more pertinent for working with OTUs, or situations where you wouldn’t want to remove the whole OTU, just subtract out a certain number of sequences from specific columns. There is currently no way to do that just from phyloseq, so I made a work-around (see the GitHub page). However, using DADA2 gives you Sequence Variants, which are more precise, and I found it’s better to remove them with approach 3.

Approach 2: Total Removal

This approach removes any contaminant sequence that is found in ANY of the negative controls from ALL the house samples, regardless of which negative control was for which extraction batch. This approach assumes that if a sequence was found as a contaminant in a negative control somewhere, it is a contaminant everywhere.
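As a toy Python illustration of total removal (again with invented names and counts, not the R code from the study), every sequence seen in any control is collected into one set and dropped from every sample:

```python
def remove_everywhere(samples, controls):
    """Drop any sequence seen in ANY control from ALL samples."""
    contaminants = set()
    for counts in controls:
        contaminants.update(seq for seq, n in counts.items() if n > 0)
    return {
        name: {seq: n for seq, n in counts.items() if seq not in contaminants}
        for name, counts in samples.items()
    }


# Hypothetical controls from two different extraction batches
controls = [{"seqA": 10}, {"seqD": 2}]
dust = {
    "home01": {"seqA": 50, "seqB": 20},
    "home02": {"seqD": 5, "seqB": 7},
}

cleaned = remove_everywhere(dust, controls)
print(cleaned)  # {'home01': {'seqB': 20}, 'home02': {'seqB': 7}}
```

Note that seqD is removed from home02 even if the control containing it belonged to a different batch, which is exactly the assumption this approach makes.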

Figure 3 Ordination of home samples after removing contaminants found in negative controls, particular to each batch, using approach 2.  

Once again, approach 2 was alright, and the primary axis (axis 1) of potential batch effect is now my secondary axis; so there is still an effect of DNA extraction batch (indicated by color scale) but it is weaker. When I recolor by different variables, there is much more clustering by Treatment than by any batch effects. However, that second axis is also one of my time variables, so I don’t want to get rid of all of the variation on that axis. But, since my negative kit controls showed a lot of variation in number and types of taxa, I don’t want to remove everything found there from all samples indiscriminately.

Additionally, I don’t favor throwing sequences out just because they were a contaminant somewhere, particularly for dust samples. Contamination can be situational, particularly if a microbe is found in the local air or water supply and would be legitimately found in house dust but would have also accidentally gotten into the extraction process.

Approach 3: “To each its own”

This approach removes all the sequences from PCR and swab contaminant SVs fully from each cohort, respectively, and removes extraction kit contaminants fully from each DNA extraction batch, respectively. I took all the sequences of the SVs found in my dust samples and made them into a vector (list), and then I took all the sequences of the SVs found in my controls and made them into a different vector.  I effectively subtracted out the contaminant SVs by name, by asking to find the sequences which were different between my two lists (thus returning the sequences which were in my dust samples but not in my control samples).  I did this respective to each sequencing cohort and batch, so that I only remove the pertinent sequences (ex. using kit control 1 to subtract from DNA extraction batch 1).
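The batch-specific set difference can be sketched in Python as follows. This is an illustration rather than the phyloseq script itself: the SV names, batch numbers, and counts are hypothetical, and only the extraction kit controls are shown (the PCR and swab controls would be handled the same way, per cohort).

```python
def remove_batch_contaminants(samples, sample_batch, kit_controls):
    """Remove from each sample only the SVs found in its own batch's kit control."""
    cleaned = {}
    for name, counts in samples.items():
        batch = sample_batch[name]
        contaminant_svs = set(kit_controls.get(batch, {}))
        # Set difference: SVs in the dust sample but not in that batch's control
        kept = set(counts) - contaminant_svs
        cleaned[name] = {sv: n for sv, n in counts.items() if sv in kept}
    return cleaned


# Hypothetical data: one kit control per DNA extraction batch
kit_controls = {1: {"svA": 10}, 2: {"svB": 3}}
sample_batch = {"home01": 1, "home02": 2}
dust = {
    "home01": {"svA": 50, "svB": 20},
    "home02": {"svA": 5, "svB": 7},
}

cleaned = remove_batch_contaminants(dust, sample_batch, kit_controls)
print(cleaned)  # {'home01': {'svB': 20}, 'home02': {'svA': 5}}
```

In this toy example svA is treated as a contaminant only in batch 1, so it survives in home02, which was extracted in batch 2.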

Figure 4 Ordination of home samples after removing contaminants found in negative controls, particular to each batch, using approach 3.  

In Figure 4, potential batch effect is solidly my secondary axis and not the primary driving force behind clustering. The primary axis (axis 1) shows a clear separation by climate zone, or location of homes, once the batch contamination has been removed.  When I recolor by different variables, there is much more clustering by Treatment and almost none by batch effects. I say almost none, because some of my DNA extraction batches also happen to be Treatment batches, as they represent a subset of samples from a different location. Thus, I can’t tell if those samples cluster separately solely because of location or also because of batch effect. However, I am satisfied with the results and ready to move on.

Unlike its namesake, this tale has a happier ending.

(Reblog) A perspective on tackling contamination in microbial ecology

Original posting from BioBE.

To study DNA or RNA, there are a number of “wet-lab” (laboratory) and “dry-lab” (analysis) steps which are required to access the genetic code from inside cells, polish it to a high-sheen such that the delicate technology we rely on can use it, and then make sense of it all.  Destructive enzymes must be removed, one strand of DNA must be turned into millions of strands so that collectively they create a measurable signal for sequencing, and contamination must be removed.  Yet, what constitutes contamination, and when or how to deal with it, remains an actively debated topic in science. Major contamination sources include human handlers, non-sterile laboratory materials, other samples during processing, and artificial generation due to technological quirks.

Contamination from human handlers

This one is easiest to understand; we constantly shed microorganisms and our own cells and these aerosolized cells may fall into samples during collection or processing.  This might be of minimal concern working with feces, where the sheer number of microbial cells in a single teaspoon swamps the number that you might have shed into it, or it may be of vital concern when investigating house dust which not only has comparatively few cells and little diversity, but is also expected to have a large amount of human-associated microorganisms present.  To combat this, researchers wear personal protective equipment (PPE) which protects you from your samples and your samples from you, and work in biosafety cabinets which use laminar air flow to prevent your microbial cloud from floating onto your workstation and samples.

Fun fact, many photos in laboratories are staged, including this one, of me as a grad student.  I’m just pretending to work.  Reflective surfaces, lighting, cramped spaces, busy scenes, and difficulty in positioning oneself make “action shots” difficult.  That’s why many lab photos are staged, and often lack PPE.

Photo Credit: Kristina Drobny

Contamination from laboratory materials

Microbiology or molecular biology laboratory materials are sterilized before and between uses, perhaps using chemicals (ex. 70% ethanol), an ultraviolet lamp, or autoclaving which combines heat and pressure to destroy, and which can be used to sterilize liquids, biological material, clothing, metal, some plastics, etc.  However, microorganisms can be tough – really tough, and can sometimes survive the harsh cleaning protocols we use.  Or, their DNA can survive, and get picked up by sequencing techniques that don’t discriminate between live and dead cellular DNA.

In addition to careful adherence to protocols, some of this biologically-sourced contamination can be handled in analysis.  A survey of human cell RNA sequence libraries found widespread contamination by bacterial RNA, which was attributed to environmental contamination.  The paper includes an interesting discussion on how to correct this bioinformatically, as well as a perspective on contamination.  Likewise, you can simply remove sequences belonging to certain taxa during quality control steps in sequence processing. There are a number of hardy bacteria that have been commonly found in laboratory reagents and are considered contaminants; the trouble is that many of these are also found in the environment, and in certain cases may be real community members.  Should one throw the Bradyrhizobium out with the laboratory water bath?

Chimeras

Like the mythical creatures these are named for, sequence chimeras are DNA (or cDNA) strands which are accidentally created when two other DNA strands merge.  Chimeric sequences can be made up of more than two DNA strand parents, but the probability of that is much lower.  Chimeras occur during PCR, which takes one strand of genetic code and makes thousands to millions of copies, and which is used in nearly all sequencing workflows at some point.  If there is an uneven voltage supplied to the machine, the amplification process can hiccup, producing partial DNA strands which can concatenate and produce a new strand, which might be confused for a new species.  These can be removed during analysis by comparing the first and second half of each of your sequences to a reference database of sequences.  If each half matches to a different “parent”, it is deemed chimeric and removed.
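The half-and-half comparison can be illustrated with a deliberately simplified Python sketch. Real chimera checkers use sequence alignment and scoring; this toy version assumes exact substring matches, and the reference sequences are made up.

```python
def parents_containing(fragment, references):
    """Names of reference sequences that contain the fragment exactly."""
    return {name for name, seq in references.items() if fragment in seq}


def looks_chimeric(query, references):
    """Flag a query whose two halves only match different reference parents."""
    middle = len(query) // 2
    first = parents_containing(query[:middle], references)
    second = parents_containing(query[middle:], references)
    if not first or not second:
        return False  # cannot judge without a match for both halves
    return first.isdisjoint(second)  # no single parent explains both halves


# Hypothetical reference database with two parent sequences
references = {
    "parent_1": "AAAACCCCGGGGTTTT",
    "parent_2": "TTGACTGACCATGCAA",
}

# Build a chimera from the first half of one parent and the second half of the other
chimera = references["parent_1"][:8] + references["parent_2"][8:]

print(looks_chimeric(chimera, references))                 # True
print(looks_chimeric(references["parent_1"], references))  # False
```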

Chimeric DNA

Cross-sample contamination

During DNA or RNA extraction, genetic code can be flicked from one sample to another during any number of wash or shaking steps, or if droplets are flicked from fast moving pipettes.  This can be mitigated by properly sealing all sample containers or plates, moving slowly and carefully controlling your technique, or using precision robots which have been programmed with exacting detail, down to the curvature of the tube used, the amount and viscosity of the liquid, and how fast you want the pipette to move, so that the computer can calculate the pressure needed to perform each task.  Sequencing machines are extremely expensive, and many labs are moving towards shared facilities or third-party service providers, both of which may use proprietary protocols.  This makes it more difficult to track possible contamination, as was the case in a recent study using RNA; the researchers found that much of the sample-sample contamination occurred at the facility or in shipping, and that this negatively affected their ability to properly analyze trends in the data.

Sample-sample contamination during sequencing

Sample-sample contamination during sequencing, however, is much more difficult to control. Each sequencing technology was designed with a different research goal in mind; for example, some generate an immense number of short reads to get high resolution on specific areas, while others aim to sequence the longest continuous piece of DNA possible before the reaction fails or becomes unreliable.  They each come with their own quirks and potential for quality control failures.

Due to the high cost of sequencing, and the practicality that most microbiome studies don’t require more than 10,000 reads per sample, it is very common to pool samples during a run.  During wet-lab processing to prepare your biological samples into a “sequencing library”, a unique piece of artificial “DNA” called a barcode, tag, or index, is added to all the pieces of genetic code in a single sample (in reality, this is not DNA but a single strand of nucleotides without any of DNA’s bells and whistles).  Each of your samples gets a different barcode, and then all your samples can be mixed together in a “pool”.  After sequencing the pool, your computer program can sort the sequences back into their respective samples using those barcodes.
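Sorting a pool back into samples can be sketched in Python like this. It is a simplification that assumes the barcode sits at the very start of each read and must match exactly; the barcodes, sample names, and reads are all invented.

```python
from collections import defaultdict


def demultiplex(reads, barcode_to_sample, barcode_length):
    """Sort pooled reads back into samples using the barcode at the start of each read."""
    by_sample = defaultdict(list)
    for read in reads:
        barcode, insert = read[:barcode_length], read[barcode_length:]
        # Reads with an unrecognized barcode are set aside rather than guessed at
        sample = barcode_to_sample.get(barcode, "unassigned")
        by_sample[sample].append(insert)
    return dict(by_sample)


# Hypothetical 4-base barcodes (real ones are longer)
barcode_to_sample = {"ACGT": "dust_01", "TGCA": "dust_02"}
pool = ["ACGTGGGTTT", "TGCAGGGAAA", "ACGTCCCTTT", "AAAAGGGCCC"]

result = demultiplex(pool, barcode_to_sample, barcode_length=4)
print(result)
# {'dust_01': ['GGGTTT', 'CCCTTT'], 'dust_02': ['GGGAAA'], 'unassigned': ['GGGCCC']}
```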

While this technique has made sequencing significantly cheaper, it adds other complications.  For example, Illumina MiSeq machines generate a certain number of sequence reads (about 200 million right now) which are divided up among the samples in that run (like a pie).   The samples are added to a sequencing plate or flow cell (for things like Illumina MiSeq).  The flow cells have multiple lanes where samples can be added; if you add a smaller number of samples to each lane, the machine will generate more sequences per sample, and if you add a larger number of samples, each one has fewer sequences at the end of the run.

Illumina GAIIx for high-throughput sequencing.

Cross-contamination can happen on a flow cell when the sample pool wasn’t thoroughly cleaned of adapters or primers, and there are great explanations of this here and here.  To generate many copies of genetic code from a single strand, you mimic DNA replication in the lab by providing all the basic ingredients (process described here).   To do that, you need to add a primer (just like with painting) which can attach to your sample DNA at a specific site and act as scaffolding for your enzyme to attach to the sample DNA and start adding bases to form a complementary strand.  Adapters are just primers with barcodes and the sequencing primer already attached.   Primers and adapters are small strands, roughly 10 to 50 nucleotides long, and are much shorter than your DNA of interest, which is generally 100 to 1000 nucleotides long.  There are a number of methods to remove them, but if they hang around and make it to the sequencing run, they can be incorporated incorrectly and make it seem like a sequence belongs to a different sample.

DNA Purification

 

Barcode swapping

This may sound easy to fix, but sequencing library preparation already goes through a lot of stringent cleaning procedures to remove everything but the DNA (or RNA) strands you want to work with.  It’s so stringent, that the problem of barcode swapping, also known as tag switching or index hopping, was not immediately apparent.  Even when it is noted, it typically affects a small number of the total sequences.  This may not be an issue, if you are working with rumen samples and are only interested in sequences which represent >1% of your total abundance.  But it can really be an issue in low biomass samples, such as air or dust, particularly in hospitals or clean rooms.  If you were trying to determine whether healthy adults were carrying but not infected by the pathogen C. difficile in their GI tract, you would be very interested in the presence of even one C. difficile sequence and would want to be extremely sure of which sample it came from.  Tag switching can be made worse by combining samples from very different sample types or genetic code targets on the same run.

There are a number of articles proposing methods of dealing with tag switching using double tags to reduce confusion or other primer design techniques, computational correction or variance stabilization of the sequence data, identification and removal of contaminant sequences, or utilizing synthetic mock controls.  Mock controls are microbial communities which have been created in the lab by mixing a few dozen microbial cultures together, and are used as a positive control to ensure your procedures are working.  Because you are adding the cells to the sample yourself, you can control the relative concentrations of each species, which can act as a standard to estimate the number of cells that might be in your biological samples.  Synthetic mock controls don’t use real organisms; they instead use synthetically created DNA to act as artificial “organisms”. If you find these in a biological sample, you know you have contamination.  One drawback to this is that positive controls always sequence really well, much better than your low-biomass biological samples, which can mean that your samples do not generate many sequences during a run, or that tag switching is encouraged from your high-biomass samples to your low-biomass samples.

Incorrect base calls

Cross-contamination during sequencing can also be a solely bioinformatic problem. Since many of the barcodes are only a few nucleotides long (10 or 12 being the most commonly used), if the computer misinterprets the bases it thinks were just added, it can read the barcode as a different one and attribute that sequence to the wrong sample.  This may not be a problem if there aren’t many incorrect sequences generated and it falls below the threshold of what is “important because it is abundant”, but again, it can be a problem if you are looking for the presence of perhaps just a few hundred cells.
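To see why a single miscalled base can matter, the Python sketch below counts the positions at which two barcodes differ (the Hamming distance) and reports the closest pairs in a set. The barcodes are invented for illustration; a pair at distance 1 can be turned into one another by one incorrect base call.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length barcodes differ."""
    if len(a) != len(b):
        raise ValueError("barcodes must be the same length")
    return sum(x != y for x, y in zip(a, b))


def closest_pairs(barcodes):
    """Smallest pairwise distance in the set, and the pairs that have it."""
    pairs = [(hamming(a, b), a, b) for a, b in combinations(barcodes, 2)]
    smallest = min(distance for distance, _, _ in pairs)
    return smallest, [(a, b) for distance, a, b in pairs if distance == smallest]


# Hypothetical 10-base barcodes
barcodes = ["ACGTACGTAC", "ACGTACGTAA", "TTGCATGCAT"]

distance, pairs = closest_pairs(barcodes)
print(distance, pairs)  # 1 [('ACGTACGTAC', 'ACGTACGTAA')]
```

The first two barcodes differ only at the last base, so one bad call there would move a read from one sample to the other, while the third barcode would need six wrong calls to be mistaken for either.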

Implications

When researching environments that have very low biomass, such as air, dust, and hospital or cleanroom surfaces, there are very few microbial cells to begin with.  Adding even a few dozen or several hundred cells can make a dramatic impact on what that microbial community looks like, and can confound findings.

Collectively, contamination issues can lead to batch effects, where all the samples that were processed together have similar contamination.  This can be confused with an actual treatment effect if you aren’t careful in how you process your samples.  For example, if all your samples from timepoint 1 were extracted, amplified, and sequenced together, and all your samples from timepoint 2 were extracted, amplified, and sequenced together later, you might find that timepoint 1 and 2 have significantly different bacterial communities.  If this was because a large number of low-abundance species were responsible for that change, you wouldn’t really know if that was because the community had changed subtly or if it was because of the collective effect of low-level contamination.

Stay tuned for a piece on batch effects in sequencing!

 

 

Menalled lab at MSU seeking graduate students

The Menalled lab has MS and PhD opportunities in agroecology, “Diversifying cropping systems through cover crops and targeted grazing: impacts on plant-microbe-insect interactions, yield, and economic returns”.

Last year, I did a post-doc in Dr. Fabian Menalled’s weed ecology lab at MSU exploring the effect of farming system and climate change on bacteria in the wheat rhizosphere.  If you love friendly lab groups, early morning field work, and being outside, then working in the Menalled lab in Bozeman, Montana might be the place for you.

Of course, in Montana, it helps if you also love winter…

A collaborative paper on how rumen acidosis affects fungi and protozoa got published!

Ruminal acidosis is a condition in which the pH of the rumen is considerably lower than normal, and if severe enough can cause damage to the stomach and localized symptoms, or systemic illness in cows.  Often, these symptoms result from the low pH reducing the ability of microorganisms to ferment fiber, or by killing them outright.  Since the cow can’t break down most of its plant-based diet without these microorganisms, this disruption can cause all sorts of downstream health problems.  Negative health effects can also occur when the pH is somewhat lowered, or is lowered briefly but repeatedly, even if the cow isn’t showing outward clinical symptoms.  This is known as sub-acute ruminal acidosis (SARA), and can also cause serious side effects for cows and an economic loss for producers.

In livestock, acidosis usually occurs when ruminants are abruptly switched to a highly fermentable diet: something with a lot of grain/starch that causes a dramatic increase in bacterial fermentation and a buildup of lactate in the rumen.  To prevent this, animals are transitioned incrementally from one diet to the next over a period of days or weeks.  Another strategy is to add something to the diet to help buffer rumen pH, such as a probiotic.  One of the most common species used to help treat or prevent acidosis is a yeast, Saccharomyces cerevisiae.

This paper was part of a larger study on S. cerevisiae use in cattle to treat SARA, the effects of which on animal production as well as bacterial diversity and functionality have already been published by an old friend and colleague of mine, Dr. Ousama AlZahal, and several others.  In total, very little work has been done on the effect of SARA or S. cerevisiae treatment on the fungal or protozoal diversity in the rumen, which is what I added to this study.  I was very pleased to be invited to analyze and interpret some of the data, as well as to present the results at a conference in Chicago earlier this year.  The article itself has just been published in Frontiers in Microbiology!


An investigation into rumen fungal and protozoal diversity in three rumen fractions, during high-fiber or grain-induced sub-acute ruminal acidosis conditions, with or without active dry yeast supplementation.

Authors: Suzanne L. Ishaq, Ousama AlZahal, Nicola Walker, Brian McBride

Sub-acute ruminal acidosis (SARA) is a gastrointestinal functional disorder in livestock characterized by low rumen pH, which reduces rumen function, microbial diversity, host performance, and host immune function. Dietary management is used to prevent SARA, often with yeast supplementation as a pH buffer. Almost nothing is known about the effect of SARA or yeast supplementation on ruminal protozoal and fungal diversity, despite their roles in fiber degradation. Dairy cows were switched from a high-fiber to high-grain diet abruptly to induce SARA, with and without active dry yeast (ADY, Saccharomyces cerevisiae) supplementation, and sampled from the rumen fluid, solids, and epimural fractions to determine microbial diversity using the protozoal 18S rRNA and the fungal ITS1 genes via Illumina MiSeq sequencing. Diet-induced SARA dramatically increased the number and abundance of rare fungal taxa, even in fluid fractions where total reads were very low, and reduced protozoal diversity. SARA selected for more lactic-acid utilizing taxa, and fewer fiber-degrading taxa. ADY treatment increased fungal richness (OTUs) but not diversity (Inverse Simpson, Shannon), but increased protozoal richness and diversity in some fractions. ADY treatment itself significantly (P < 0.05) affected the abundance of numerous fungal genera as seen in the high-fiber diet: Lewia, Neocallimastix, and Phoma were increased, while Alternaria, Candida, Orpinomyces, and Piromyces spp. were decreased. Likewise, for protozoa, ADY itself increased Isotricha intestinalis but decreased Entodinium furca spp. Multivariate analyses showed diet type was most significant in driving diversity, followed by yeast treatment, for AMOVA, ANOSIM, and weighted UniFrac. Diet, ADY, and location were all significant factors for fungi (PERMANOVA, P = 0.0001, P = 0.0452, P = 0.0068, Monte Carlo correction, respectively), and location was a significant factor (P = 0.001, Monte Carlo correction) for protozoa.
Diet-induced SARA shifts diversity of rumen fungi and protozoa and selects against fiber-degrading species. Supplementation with ADY mitigated this reduction in protozoa, presumptively by triggering microbial diversity shifts (as seen even in the high-fiber diet) that resulted in pH stabilization. ADY did not recover the initial community structure that was seen in pre-SARA conditions.
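
For readers less familiar with the metrics named in the abstract, here is a minimal sketch of how richness can go up while diversity does not.  It uses the textbook forms of observed richness, the Shannon index, and the inverse Simpson index on made-up OTU counts; these are not the study's data, and the calculators used in the paper may apply corrections that this sketch omits.

```python
import math


def richness(counts):
    """Observed richness: the number of OTUs with at least one read."""
    return sum(1 for c in counts if c > 0)


def shannon(counts):
    """Shannon index H = -sum(p * ln p) over OTUs that are present."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def inverse_simpson(counts):
    """Inverse Simpson index 1 / sum(p^2)."""
    total = sum(counts)
    return 1.0 / sum((c / total) ** 2 for c in counts)


# Hypothetical read counts per OTU for two communities of 100 reads each.
even_community = [25, 25, 25, 25]             # 4 OTUs, perfectly even
rare_rich_community = [94, 1, 1, 1, 1, 1, 1]  # 7 OTUs, one dominant

for name, counts in [("even", even_community), ("rare-rich", rare_rich_community)]:
    print(
        name,
        richness(counts),
        round(shannon(counts), 3),
        round(inverse_simpson(counts), 3),
    )
```

The second community has more OTUs, yet both diversity indices are lower because nearly all of the reads belong to a single OTU.  Adding rare taxa raises richness without necessarily raising diversity.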

A collaborative project on juniper diets in lambs was published!

In 2015, while working in the Yeoman Lab, I was invited to perform the sequence analysis on some samples from a previously-run diet study.  The study was part of ongoing research by Dr. Travis Whitney at Texas A&M on the use of juniper as a feed additive for sheep.  The three main juniper species in Texas can pose a problem: while they are native, they have significantly increased the number of acres they occupy due to changes in climate, water availability, and human-related land use.  And, juniper can out-compete other rangeland species, which can make forage less palatable, less nutritious, or unhealthy for livestock.  Juniper contains essential oils and compounds which can affect some of the microorganisms living in the animals’ gut.  We wanted to know how the bacterial community in the rumen might restructure while on different concentrations of juniper and urea.

Coupled with the animal health and physiology aspect led by Travis, we published two companion papers in the Journal of Animal Science.  We had also previously presented these results at the Joint Annual Meeting of the American Society for Animal Science, the American Dairy Science Association, and the Canadian Society for Animal Science in Salt Lake City, UT in 2016.  Travis’ presentation can be found here, and mine can be found here.  The article can be found here.


Ground redberry juniper and urea in supplements fed to Rambouillet ewe lambs.

Part 1: Growth, blood serum and fecal characteristics, T.R. Whitney

Part 2: Ewe lamb rumen microbial communities, S. L. Ishaq, C. J. Yeoman, and T. R. Whitney

This study evaluated effects of ground redberry juniper (Juniperus pinchotii) and urea in dried distillers grains with solubles-based supplements fed to Rambouillet ewe lambs (n = 48) on rumen physiological parameters and bacterial diversity. In a randomized study (40 d), individually-penned lambs were fed ad libitum ground sorghum-sudangrass hay and 1 of 8 supplements (6 lambs/treatment; 533 g/d; as-fed basis) in a 4 × 2 factorial design with 4 concentrations of ground juniper (15%, 30%, 45%, or 60% of DM) and 2 levels of urea (1% or 3% of DM). Increasing juniper resulted in minor changes in microbial β-diversity (PERMANOVA, pseudo F = 1.33, P = 0.04); however, concentrations of urea did not show detectable broad-scale differences at phylum, family, or genus levels according to ANOSIM (P > 0.05), AMOVA (P > 0.10), and PERMANOVA (P > 0.05). Linear discriminant analysis indicated some genera were specific to certain dietary treatments (P < 0.05), though none of these genera were present in high abundance; high concentrations of juniper were associated with Moraxella and Streptococcus, low concentrations of urea were associated with Fretibacterium, and high concentrations of urea were associated with Oribacterium and Pyramidobacter. Prevotella were decreased by juniper and urea. Ruminococcus, Butyrivibrio, and Succiniclasticum increased with juniper and were positively correlated (Spearman’s, P < 0.05) with each other but not to rumen factors, suggesting a symbiotic interaction. Overall, there was not a juniper × urea interaction for total VFA, VFA by concentration or percent total, pH, or ammonia (P > 0.29). When considering only percent inclusion of juniper, ruminal pH and proportion of acetic acid linearly increased (P < 0.001) and percentage of butyric acid linearly decreased (P = 0.009). Lamb ADG and G:F were positively correlated with Prevotella (Spearman’s, P < 0.05) and negatively correlated with Synergistaceae, the BS5 group, and Lentisphaerae.
Firmicutes were negatively correlated with serum urea nitrogen, ammonia, total VFA, total acetate, and total propionate. Overall, modest differences in bacterial diversity among treatments occurred in the abundance or evenness of several OTUs, but there was not a significant difference in OTU richness. As diversity was largely unchanged, the reduction in ADG and lower-end BW was likely due to reduced DMI rather than a reduction in microbial fermentative ability.

Field notes from my first ESA meeting

From iDigBio
A couple of weeks ago, I attended my first Ecological Society of America meeting in Portland, which assembles a diverse community of researchers looking at system-wide processes.  It was an excellent learning experience for me, as scientific fields each have a particular set of tools to look at different problems and our collective perspectives can solve research problems in more creative ways.

In particular, it was intriguing to attend talks on the ecology of the human microbiome.  Due to the complexity of host-associated microbial communities, and the limitations of technology, the majority of studies to date have been somewhat observational.  We have mapped what is present in different animals, in different areas of the body, under different diet conditions, in different parts of the world, and in comparison between healthy and disease states.  But given the complexity of the day-to-day life of people, and the ethical or technical difficulty of doing experimental studies in humans, many of the broader ecological questions have yet to be answered.

For example, how quickly do microbial communities assemble in humans?  When you disturb them or change something (like adding a medication or removing a food from your diet) how quickly does this manifest in the community structure and do those changes last? How does dysbiosis or dysfunction in the body specifically contribute to changes in the microbial community, or do seemingly harmless events trigger a change in the microbial community which then causes disease in humans? Some of the presentations I attended have begun teasing out these problems with a combination of observational in situ biological studies, in vitro laboratory studies, and in silico mathematical modeling.  The abstracts from all the meeting presentations can be found on the meeting website under Program.  I have also summarized several of the talks I went to on Give Me The Short Version.

One of my favorite parts was attending an open lunch with 500 Women Scientists, a recently-formed organization which promotes diversity and equality in science, and supports local activists to help change policy and preconceived notions about diversity in STEM.  The lunch meeting introduced the organization to the conference participants in attendance, asked us to voice our concerns or difficulties we had faced, encouraged us to reach out to others in our work network to seek advice and provide mentoring, and walked us through exercises designed to educate on how to build a more inclusive society.

500 Women Scientists at ESA, August 2017

My poster presentation was on Wednesday, halfway through the meeting week, which gave me plenty of time to prepare.  You never know who might show up at your poster and what questions they’ll have.  In the past, I’ve always had a steady stream of people to chat with at my poster which has led to a number of scientific friendships and networking, and this year was no different.  The rather large (but detailed) poster file can be found here: Ishaq et al ESA 2017 poster.  Keep in mind that this is preliminary work, and many statistical tests have not yet been applied or verified.  I’ve been working to complete the analysis on the large study, which also encompasses a great deal of environmental data.  We hope to have a manuscript drafted by this fall on this part of the project, and several more over the next year from the research team as this is part of a larger study; stay tuned!

In preparation for the ESA conference next week

I’m counting down the days for my first Ecological Society of America (ESA) conference next week in Portland, OR.  Over the last few weeks, I’ve been diligently working to finish as much analysis as possible on the data from my recent post-doc, as I am presenting a poster on Wednesday, August 9th from 4:30 to 6:30 pm; PS 31-13 – Soil bacterial diversity in response to stress from farming system, climate change, weed diversity, and wheat streak virus.

Several of my new colleagues will also be presenting on their recent work, including a talk from Roo Vandegrift on the built environment and the microbiome of human skin, and one from Ashkaan Fahimipour on the dynamics of food webs.

The theme for this year’s ESA meeting is “Linking biodiversity, material cycling and ecosystem services in a changing world”, and judging from the extravagant list of presenting authors, it’s going to be an extremely large meeting.  It’s worth remembering that large conferences like these bring together researchers from each rung of the career ladder, and many of the invited speakers will be presenting on work that might have been done by dozens of scientists over decades.  Seeing only the polished summary can be intimidating: lots of scientists I’ve spoken to feel daunted by these comprehensive meeting talks because the speakers seem so much smarter and more successful than they are.  It’s something I jokingly refer to as “pipette envy”: when you are at a conference thinking that everyone does cooler science than you.  Just remember, someone also deemed your work good enough to present at the same conference!

Harvesting a feast of data

My greenhouse trial on the legacy effects of farming systems and climate change has concluded!  Over this past fall and winter, I maintained a total of 648 pots across three replicate trials (216 pots per trial).  In the past few weeks, we harvested the plants and took various measurements: all-day affairs that required the help of several dedicated undergraduate researchers.

In case you were wondering why research can be so time- and labor-intensive, over the course of the trials we hand-washed 648 pot tags twice and 648 plant pots twice, and planted 7,776 wheat seeds across two conditioning phases, plus 1,944 wheat seeds and 1,944 pea seeds for the response phase.  We counted seedling emergence for those seeds every day for a week after each of the three planting dates in each of the three trials (9 plantings all together).  Of those 11,664 plants, we hand-plucked 7,776 seedlings and grew the other 3,888 until harvesting, which required watering nearly every day for over four months.  At harvest, we counted wheat tillers or pea flowers, as well as weighed the biomass on those 3,888, and measured the height on 1,296 of them.  And this is only a side study to the larger field trial I am helping conduct!  All told, we have a massive amount of data to process, but we hope to have a manuscript ready by mid-summer – stay tuned!
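
For anyone who wants to check the tally, here is a minimal sketch that re-derives the totals using only the counts stated in this post; the variable names are just labels chosen for illustration.

```python
# Counts taken directly from the post.
pots_total = 648
replicate_trials = 3
planting_dates_per_trial = 3

conditioning_wheat_seeds = 7_776  # across the two conditioning phases
response_wheat_seeds = 1_944
response_pea_seeds = 1_944
seedlings_plucked = 7_776

# Derived totals.
pots_per_trial = pots_total // replicate_trials
seeds_planted = (
    conditioning_wheat_seeds + response_wheat_seeds + response_pea_seeds
)
plants_grown_to_harvest = seeds_planted - seedlings_plucked
plantings = planting_dates_per_trial * replicate_trials

print(pots_per_trial, seeds_planted, plants_grown_to_harvest, plantings)
```

The derived values match the figures quoted above: 216 pots per trial, 11,664 seeds planted, 3,888 plants grown to harvest, and 9 plantings.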
