What is a microbiome? Asking for a friend.

If you find that the word ‘microbiome’ has crept into your lexicon but you don’t really know what it means or how to use it – fear not, you’re not alone. Microbiome is a new-ish term to describe something that has been studied for almost a century: the collection of microorganisms in a dynamic ecosystem, including who they are and what they are doing.

Picture a crowd of humans. Maybe this one:

Image Source: Wikimedia Commons, 2017-09-09 Oregon Ducks vs. Nebraska Cornhuskers

The picture is just one instant in an event involving hundreds or thousands of organisms that were all doing a lot of different things, sometimes for just a few seconds. How would you describe it?

Maybe using the number of members present in this community? Or a list of names of attendees? The 16S rRNA gene for prokaryotes, or the 18S rRNA gene or ITS region for eukaryotes, for example, would tell us that. Those markers are found in all organisms of those types, and they are a pretty effective means of basic identification. But a marker is only as good as how often it is found in the organisms you are looking for. There is no one gene that’s found exactly the same in all organisms, so you might need to target multiple different identification genes to look at all the different types of microorganisms, such as bacteria, fungi, protozoa, or archaea. Viruses don’t share a common gene across types, so to look at viruses you’d need something else.

From our identification genes we could identify all the organisms wearing yellow, e.g., phylogenetic Family = Ducks. That wouldn’t tell us if they were always found in this ecosystem (native Eugene population) or just passing through (transient population), but we could figure that out if we looked at every home game of the season and found certain community members there time and again.
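The "look at every home game" idea can be sketched in a few lines of Python. This is only a minimal illustration, not a real analysis pipeline: the game names, the attendee labels, the function name `split_core_and_transient`, and the "present at every game" cutoff are all assumptions made up for the example.

```python
from collections import Counter

def split_core_and_transient(samples, min_fraction=1.0):
    """Split members into 'core' (seen in at least min_fraction of samples)
    and 'transient' (everyone else)."""
    counts = Counter()
    for members in samples.values():
        counts.update(set(members))  # count each member once per sample
    n_samples = len(samples)
    core = {m for m, c in counts.items() if c / n_samples >= min_fraction}
    transient = set(counts) - core
    return core, transient

# Hypothetical attendance lists for three home games
games = {
    "game_1": ["Ducks", "Cornhuskers", "Vendors"],
    "game_2": ["Ducks", "Huskies", "Vendors"],
    "game_3": ["Ducks", "Beavers", "Vendors"],
}

core, transient = split_core_and_transient(games)
print(sorted(core))       # ['Ducks', 'Vendors']
print(sorted(transient))  # ['Beavers', 'Cornhuskers', 'Huskies']
```

Lowering `min_fraction` (say, to 0.8) would count a member as native even if they missed the occasional game, which is a judgment call rather than a fixed rule.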

But knowing they are Ducks doesn’t tell us anything else about that community member. What will they do if it starts raining? Are they able to go mountain biking? Perhaps we could identify their potential for activity by looking at the objects they are carrying? That would be akin to metagenomics, identifying all the DNA present from all the organisms, which tells us what genes are present, but not whether they are currently, or ever, used. It can be challenging to interpret: think of sequencing data from one organism’s genome as one 1,000,000-piece puzzle and all the genomes in a community as 1,000 1,000,000-piece puzzles all dumped in a pile. In the crowd, metagenomics would tell us who had a credit card that was specifically used to buy umbrellas, but not whether they’d actually use the umbrella if it rains (e.g., Eugeneans would not).

We could describe what everyone is doing at this moment. That would be transcriptomics, identifying all the RNA to determine which genes were actively being transcribed into RNA, the step before that RNA is translated into proteins for use in some cellular function. If we see someone in the crowd using that credit card for an umbrella (DNA), the receipt would be the RNA. RNA is a working copy you make of the DNA to take to another part of the cell and use as a blueprint to make a protein. You don’t want your entire genome moving around, or need it to make one protein, so you make a small piece of RNA that will only hang around for a short period before degrading (i.e., you crumpling that RNA receipt and throwing it away because who keeps receipts anymore).

Using transcriptomics, we’d see you were activating your money to get that umbrella, but we wouldn’t see the umbrella itself. For that, we’d need metabolomics, which uses chemistry and physics instead of genomics in order to identify chemicals (most often small molecules such as sugars, amino acids, and lipids; cataloging the proteins themselves is called proteomics). Think of metabolomics as describing this crowd by all the trash and crumbs and miscellaneous items they left behind. It’s one way to know what biological processes occurred (popcorn consumption and digestion).

Image Source: Wikimedia Commons, Metabolomics

From a technical standpoint, researching a microbiome might mean looking at all the DNA from all the organisms present to know who they are and of what they are capable. It might also mean looking at all the RNA present, which would tell you what genes were being used by “everyone” for whatever they were doing at a particular moment. Or you might also add metabolomics to identify all the chemical metabolites, which would be all the end products of what those cells were doing, and which are more stable than RNA so they could give you data about a longer frame of time. Collectively, -omics are technologies that each look at all of a certain biological substance to help you understand a dynamic community. However, it’s important to remember that each technology gives a particular view of the community and comes with its own limitations.

The Fine Art of Finding Scientific Information

Not a day goes by that I don’t search for information, and whether that information is a movie showtime or the mechanism by which a bacterial species is resistant to zinc toxicity, I need that information to be accurate. In the era of real fake-news and fake real-news, mockumentaries, and misinformation campaigns, the ability to find accurate and unbiased information is more important than ever.

Yet, assessing the validity of information and verifying sources is an under-appreciated and under-taught skill. There are some great resources available for determining the reliability (are the same results achieved each time?) and validity (is it a real effect?) of a dataset, as well as of the authors. Even with fact-evaluation resources available through The National Center for Complementary and Integrative Health (NCCIH), The University of Maine, The Georgetown University Library, or Michigan State University, finding information, like any skill, takes practice.

Where do I go for Science Information?

Thanks to the massive shift towards digital archiving and open-access online journals, nearly all of my information hunting is done online (an excellent reason why Net Neutrality is vital to researchers). Most of the time, this information is in the form of scientific journal articles or books online, and finding it can be accomplished by using regular search engines. In particular, Google has really pushed to improve its ability to index scientific publications (critical to Google Scholar and Paperpile).

However, it takes skill to compose your search request to find accurate results. I nearly always add “journal article” or “scientific study” to the end of my query because I need the original sources of information, not popular media reports on it. This cuts out A LOT of inaccuracy in search results. If I’m looking for more general information, I might add “review” to find scientific papers which broadly summarize the results of dozens to hundreds of smaller studies on a particular topic. If I have no idea where to begin and need basic information on what I’m trying to look for, I will try my luck with a general search online or even Wikipedia (scientists have made a concerted effort to improve many science-related entries). This can help me figure out the right terminology to phrase my question.
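The habit of tacking a qualifier onto the end of a search can be written down as a tiny helper. This is just a sketch of my own routine, assuming nothing about how any search engine works: the function name `build_query` and the labels "primary", "study", "review", and "general" are hypothetical, and the only real content is the suffixes mentioned above.

```python
# Suffixes taken from the habits described above; the dictionary keys are made-up labels.
SUFFIXES = {
    "primary": "journal article",  # original sources, not media reports
    "study": "scientific study",   # alternative phrasing for original sources
    "review": "review",            # broad summaries of many smaller studies
    "general": "",                 # no qualifier: a plain search to learn the terminology
}

def build_query(topic, kind="primary"):
    """Append the qualifier for the chosen kind of source to the search topic."""
    suffix = SUFFIXES[kind]
    return f"{topic} {suffix}".strip()

print(build_query("bacterial zinc resistance mechanism"))
# bacterial zinc resistance mechanism journal article
print(build_query("bacterial zinc resistance mechanism", kind="review"))
# bacterial zinc resistance mechanism review
```

The resulting string is what gets pasted into a regular search engine; the helper does no searching itself.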

How do I know if it’s accurate?

One of the things I’m searching for when looking for accurate sources is peer-review. Typically, scientific manuscripts submitted to reputable journals are reviewed by 1 to 3 other authorities in that field, more if the paper goes through several journal submissions. The reviewers may know who the authors are, but the authors don’t know their reviewers until at least after publication, and sometimes never. This single-blind (or double-blind if the reviewers can’t see the authors’ names) process allows for manuscripts to be reviewed, edited, and challenged before they are published. Note that perspective or opinion pieces in journals are typically not peer-reviewed, as they don’t contain new data, just interpretation. The demand for rapid publishing rates and the rise of predatory journals have led some outlets to publish without peer-review, and I avoid those sources. Peer-review matters because scientists might not see the flaws or errors in their own study, and having an outside party question your results improves your ability to communicate those results accurately.

Image Source: Kriegeskorte, 2012

Another way to assess the validity of an article is the inclusion of correct control groups. The control group acts as a baseline against which you can measure your treatment effects: its members go through the same experimental parameters, except they don’t receive an active treatment. Instead, the group receives a placebo, because you want to make sure that the acts of experimentation and observation themselves do not lead to a reaction, which is known as The Placebo Effect. The Placebo Effect is a very real thing and can really throw off your results when working with humans.
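To see why the baseline matters, here is a minimal sketch with made-up numbers. The scores, the group sizes, and the function name `treatment_effect` are all hypothetical, and a real study would also test whether the difference is statistically meaningful rather than just subtracting two averages.

```python
from statistics import mean

def treatment_effect(treated, placebo):
    """Difference between group means: the change beyond what the placebo group shows."""
    return mean(treated) - mean(placebo)

# Hypothetical improvement scores (higher = more improvement); not real data
placebo_group = [2, 3, 1, 2]   # improved a bit with no active treatment
treated_group = [5, 6, 4, 5]

print(mean(treated_group))                             # raw improvement in the treated group
print(treatment_effect(treated_group, placebo_group))  # improvement beyond the placebo baseline
```

Without the placebo group, all of the treated group’s improvement would be credited to the treatment, including the part that happened just from being in a study.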

Similarly, one study does not a scientific law make. Scientific results can be situational, or particular to the parameters in that study, and might not be generalizable (applicable to a broader audience or circumstances). It often takes dozens if not a hundred studies to get at the underlying mechanisms of an experimental effect, or to show that the effect is reliably recreated across experiments.

Data or it didn’t happen. I can’t stress this one enough. Making a claim, statement, or conclusion is hollow until you have supplied observations to support it. This is a really common problem in internet-based arguments, as people put forth references as fact when they are actually opinionated speeches or videos that don’t list their sources. These opinionated speeches have their place, and I post a lot of them myself. They often say what I want to say in a much more eloquent manner. Unfortunately, they are not data and can’t prove your point.

The other reason you need data to match your statements is that in almost all scientific articles, the authors include speculation and untested ideas in the Discussion section. This is meant to provide context to the study, or ponder over the broader meaning, or identify things which need to be verified in future studies. But often these statements are repeated in other articles as if they were facts which were evaluated in the first article, and the ideas get perpetuated as proven facts instead of as theories to be tested. This often happens when the first article is hidden behind a paywall and you end up taking the second paper’s word for it about what happened in the first paper. It’s only when the claim is traced all the way back to the original article that you find that someone mistook supposition for data.

The “Echo Chamber Effect” is also prominent when it comes to translating scientific articles into news publications, a great example of which is discussed by 538. Researchers mapped the genomes of about 30 transgender individuals, roughly half male to female and half female to male, to get an idea of whether gender identity could be described with a nuanced genetic fingerprint rather than a binary category. This is an extremely small sample group, and the paper was more about testing the idea and suggesting some genes which would be used for the fingerprint. In the mix-up, comments about the research were attributed to a journalist at 538 (comments that the journalist had not made), and this error was perpetuated when further news organizations used other news publications as their source instead of conducting their own interviews or referencing the original publication. In addition, the findings and impact of the study were wrongly reported: it was stated that 7 genes had been identified by researchers as your gender fingerprint, which is a gross exaggeration of what the original research article was really about. When possible, try to trace information back to its origin, and get comments straight from the source.

How do I know if it’s unbiased?

This can be tricky, as there are a number of ways someone can have a conflict of interest. One giveaway is tone, as scientific texts are supposed to remain neutral. You can also check the author affiliations (who they are and what institution they are at), the conflict of interest section, and the disclosure of funding source or acknowledgements sections, all of which are common inclusions in scientific papers. “Following the money” is a particularly good way of determining if there is bias involved, depending on the reputation of the publisher.

When in doubt, try asking a librarian

There are a lot of resources online and in-person to help you find accurate information, and public libraries and databases are free to use!

Image Source: Guadamillas Gómez, 2017, Figure 7