The Bioinformatics Plunge
An attempt at making a broad question much more narrow
As preamble for this post, I have to make a confession. I’m a redditor, and an avid one at that. I spend quite a bit of time on r/bioinformatics (which I highly recommend if you somehow found your way to reading whatever I have to say on this topic), where I can blow off steam by commenting on discussions and posts made on the forum. Overall, a great community over there! But ever since I started out on the platform, I have seen one question come up again and again, usually posted in a form like this:
Help getting started in bioinformatics
Hi I am a [undergrad/graduate/job title] and I have [insert non-bioinformatic training]. I’m looking to make a change in my career and have been looking at Bioinformatics for [insert amount of time] but am unsure where to begin or what to do. Can I please get some assistance?!?!
I see this type of question ad nauseam on the forum, and the moderators spend a good amount of time taking these posts down because this kind of request for help isn’t really what the subreddit is for; it’s supposed to be a place to delve into the technical side of the field and to prompt discussion of the latest trends and technology. To play devil’s advocate here, however, I’d contend that the above is a really good question, and I believe it should have a solid answer that is modern and succinct. Still, I think there is an underlying issue that isn’t addressed when we consider someone’s initial curiosity about the field of bioinformatics: What is your ‘why’ for doing it?
What do I mean by this? If someone were to ask me the question, I could answer in one of two ways. The first, and the one that I’d argue 90% of people would go with, would be to just answer the question directly, i.e. read this review paper, do these types of projects, build your portfolio, follow certain researchers, get into this masters/Ph. D. program, etc. This answers the question as stated, and the poster will be quite content with getting an answer. However, I think that in just answering the question, we aren’t really providing true guidance to the person asking. What I believe would be much more appropriate is to let them know what the field of bioinformatics is like - warts and all - so they can make an informed decision about whether to pursue it.
Bioinformatics is an old field, and in some respects way older than I thought it could be (I was blown away when I read up on Carl Woese’s experiments that paved the way for phylogenetic clustering of 16S ribosomal RNA, which was done in 1977). But I, too, am starting to find that I’ve spent a lot of my life as a bioinformatician (it started 10 years ago, when I was given an opportunity to write Perl scripts for DNA analysis during my undergraduate degree. ‘What a language,’ I thought to myself back then, ‘I’ll be using Perl forever!’). As un-fun a thought as it is for me, 10 years have passed and I’m getting older. And in getting older, I’ve seen a few tropes and motifs in this field (pun intended). Having seen the trials and tribulations the field has to offer, I have my opinions. Maybe they are well-founded, perhaps they are unfounded, but I’m still going to share them (this is my corner of the internet, and I will exert my will mightily!).
The reason I feel we should answer beginners’ questions with a dash of realism is founded in those opinions. I grew to love the field of bioinformatics, but I’d be lying if I said I came into the field knowing what I wanted out of it. I simply did it because I (a) was good at it, (b) got praise for it, and (c) felt that with bioinformatic research in 2014 I was embarking on something new and exciting - and the beautiful thing about it was that it absolutely was. I remember getting into arguments with my family about not becoming an ‘actual doctor’ in order to pursue analyzing DNA on computers, but I was really secure in the belief that whatever bioinformatics was, I was meant to do it. That feeling led me to my Ph. D. and eventually to my PostDoc as well. Along the way, however, I came to see that there was a fair share of valleys to get through as well. Sometimes, in times of reflection, I wish someone had sat me down and given me a proper survey of what to expect in the field.
There are absolutely a handful of things to consider when giving a fair and honest answer about what it is like to pursue a career in bioinformatics, but I’ll constrain myself to the two bigger pieces of advice I’d use in answering the question.
1) Are you ready to learn a ton of stuff?
Typically, when the question of how to get better at bioinformatics is asked, it’s asked by people at two extremes of expertise: the folks who know a lot about wet lab procedures but lack basic coding/math/statistics skills, or the complete opposite, people who are coding/math/statistics wizards but have a limited handle on biological concepts. That makes sense, because most people who want to learn how to enter the field are generally just on its periphery. The question, however, carries different meanings depending on the skill set of the person asking it.
Bioinformatics is not just about the coding
Let’s start with the coding-inclined talent. For them, asking how to get into bioinformatics typically comes from a more confident and secure place, since the assumption is that they already possess a lot of the skills needed to carry out analysis and research (Python expertise, knowing what a p-value is, working on the command line). But sometimes I meet someone who is so talented on the statistical and mathematical side of the field that they ignore the nuance that biological systems present. Think about this: molecular phylogenetics can in essence be a purely computational and statistical endeavor, since at its core we are just asking “which sequences are similar to each other?”. And we have algorithms that ignore the actual biological aspects of molecular evolution. You might know them as Neighbor-Joining, Unweighted Pair Group Method with Arithmetic Mean (UPGMA), and Minimum Spanning Trees, but these are typically not the final word on how sequences are related to each other. They rest on simplistic assumptions that neglect nuance in how sequence data evolve; UPGMA, for example, assumes a constant rate of molecular evolution (i.e. a strict molecular clock). Methods such as Maximum Likelihood and Bayesian Inference are way better at incorporating the biological realities of molecular evolution, and in doing so they have improved the field in general. That was a bit of a specific and unnecessary dive, but you can see my point - just having computational skills doesn’t guarantee the most accurate biological interpretation, and a deeper understanding of biology is what I believe sets bioinformaticians apart.
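To make the “purely computational” point concrete, here is a minimal sketch of UPGMA in plain Python. It is only an illustration: the four toy sequences are invented, the function names are my own, and the distance used is a simple proportion of mismatched sites rather than a proper substitution model. Notice that nothing in it knows anything about biology; it just keeps merging whichever two clusters are closest.

```python
def p_distance(a, b):
    """Proportion of sites that differ between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def upgma(seqs):
    """Cluster aligned sequences with UPGMA.

    seqs: dict mapping name -> aligned sequence (toy input, an assumption).
    Returns the tree topology as a Newick-like string without branch lengths.
    """
    names = list(seqs)
    n = len(names)
    # Each cluster id maps to (label, number of leaves in the cluster).
    clusters = {i: (names[i], 1) for i in range(n)}
    # Pairwise distances keyed by (smaller id, larger id).
    dist = {
        (i, j): p_distance(seqs[names[i]], seqs[names[j]])
        for i in range(n)
        for j in range(i + 1, n)
    }
    next_id = n
    while len(clusters) > 1:
        # Pick the closest pair of clusters and merge them.
        i, j = min(dist, key=dist.get)
        (label_i, size_i) = clusters.pop(i)
        (label_j, size_j) = clusters.pop(j)
        # Distance from the merged cluster to every other cluster is the
        # size-weighted average of the two old distances (the "arithmetic
        # mean" in UPGMA).
        merged = {}
        for k in clusters:
            d_ik = dist[(min(i, k), max(i, k))]
            d_jk = dist[(min(j, k), max(j, k))]
            merged[(k, next_id)] = (size_i * d_ik + size_j * d_jk) / (size_i + size_j)
        # Drop distances involving the two clusters that no longer exist.
        dist = {pair: d for pair, d in dist.items() if i not in pair and j not in pair}
        dist.update(merged)
        clusters[next_id] = (f"({label_i},{label_j})", size_i + size_j)
        next_id += 1
    return next(iter(clusters.values()))[0]


if __name__ == "__main__":
    toy = {
        "A": "ACGTACGT",
        "B": "ACGTACGA",
        "C": "ACGTTCAA",
        "D": "TCGATCAA",
    }
    print(upgma(toy))  # prints ((A,B),(C,D))
```

Because every merge is an average of raw distances, this procedure implicitly treats all lineages as evolving at the same rate. If one lineage in your data evolves much faster than the others, a sketch like this can happily group the wrong sequences together, which is exactly where knowing the biology starts to matter.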
Bioinformatics is not just about the biology
I think this kind of sums up the situation for most people coming from the biotech/wet lab side of things, where their technical skills are very high. There is a ‘fuzziness’ that comes from being in the wet lab and understanding how experiments can sometimes work or not work. However, among folks who specialize in this level of biological rigor, I hear a lot of them express discomfort with the idea of having to learn how to code, or to learn math concepts to a deeper degree, in order to be a useful bioinformatician. Maybe it’s due to a bad first run-in with a math or programming class, but that’s just my anecdotal guess at why biologists whose experience is predominantly in physical experimentation don’t easily gravitate towards the ‘techy’ side of bioinformatics. But I think that, past the basics of applying certain software and working on the command line, it is necessary to have a more in-depth understanding of how to manipulate and analyze data towards a research end.
What do I mean here? I think a bit of self-promotion might give an idea. In a separate blog post, I write about my experience trying to salvage a genome assembly project that did not turn out the best for me and my research team. Let’s say that, for a wet lab researcher who generated the sequencing data, finding that the reads contained some extra microbes we did not expect would simply have meant that the experiment was a failure and required another attempt at sequencing the algal genome. And if you don’t know any better, this would be a pretty reasonable conclusion. But being able to delve into the sequencing data using programmatic methods helped identify the reads that originated from the organism we were studying while excluding others that came from a potential bacterial contaminant. That little bit of bioinformatics know-how was potentially the thing that saved the project, and saved the lab’s purse strings from having to redo the sequencing.
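To give a flavor of what “delving into the data programmatically” can look like, here is a small sketch that splits reads by GC content. To be clear, this is not the method from that project; it is one crude, commonly used first-pass heuristic, and the function names, the toy reads, and the 0.2 to 0.5 window are all made up for illustration. In practice you would more likely lean on alignment or taxonomic classification tools and pick thresholds from your own data.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a dict of read name -> sequence."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header as the read name.
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {read: "".join(parts) for read, parts in records.items()}


def gc_fraction(seq):
    """Fraction of called bases (A, C, G, T) that are G or C."""
    seq = seq.upper()
    called = sum(seq.count(base) for base in "ACGT")
    if called == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / called


def split_by_gc(reads, low, high):
    """Separate reads whose GC fraction falls inside [low, high] from the rest.

    low/high are hypothetical bounds for the target organism; they are an
    assumption of this sketch, not values from any real genome.
    """
    keep, flagged = {}, {}
    for name, seq in reads.items():
        target = keep if low <= gc_fraction(seq) <= high else flagged
        target[name] = seq
    return keep, flagged


if __name__ == "__main__":
    toy_reads = parse_fasta(">read1\nATATGCAT\n>read2\nGCGCGCGG\n")
    keep, flagged = split_by_gc(toy_reads, low=0.2, high=0.5)
    print(sorted(keep), sorted(flagged))  # prints ['read1'] ['read2']
```

A filter like this will never be the whole answer, since plenty of unrelated organisms share similar GC content, but even something this simple can tell you whether a “failed” run is actually a mix of two populations of reads worth separating.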
TLDR: A Bioinformatician requires varied experience
The two parts above were basically my attempt at showing that a fully formed bioinformatics professional isn’t (or rather shouldn’t be) without an above-average understanding of both biology and coding principles, especially if they want to work on tasks at the cutting edge of biotech (microbiome, MRD, and single cell, just to name a few).
2) Are you aware of bioinformatics use cases?
Let’s consider another dimension of answering the question - that bioinformatics is just a nebulous term to describe a bunch of different practices and procedures. When someone says they want to learn bioinformatics, I understand of course that they want to merge an interest in both biology and computation, but my issue is that this is just an incredibly vague goal if you know how intricate the field is. There are a ton of sub-disciplines that fall under the umbrella of bioinformatics such as:
Population Genomics
Microbial Ecology
Molecular Simulation
Phylogenomics
And much, much more. That being said, you can probably become really good at a select few of these fields, but not all of them. Thus, when folks ask me what the best way to learn and delve into bioinformatics is, I try to tell them to investigate which sub-fields might interest them, not to mention the particular study system they are interested in working with. I think that’s only fair, because if you are going to dedicate a bunch of money to a Masters program or Ph. D. and, let’s be real, be prepared to live at a certain level of poverty in pursuit of the degree, it’s my personal opinion that you should at least know specifically what you are trying to do.
What’s the end goal?
Fitting into this theme of knowing what you want to do prior to taking the plunge into bioinformatics, I will impart one last bit of unsolicited advice to all newcomers: work with an end goal in mind.
At the time of this writing (late 2024), I would contend that the bioinformatics job market is incredibly brutal, and the Ph. D. doesn’t quite hold the same weight that it might have prior to the COVID pandemic (I’m speaking anecdotally here, from my own experience and from talking with friends and peers, but it was not a fun job search, period). During my own job search, I realized something quite jarring, albeit understandable - that the work I was doing in my Ph. D. was not preparing me for a career in industry as much as I had made myself think it was. That’s why, more than ever, taking a proactive role in your bioinformatic training is almost a necessity.
So what does this look like? It starts with a bit of honesty with yourself and some self-reflection. For a few of my colleagues, it was the realization that academia, as it is structured right now, is not the place for everyone. I won’t spend too much time on the nuances of this, but essentially I would suggest knowing what you ultimately want before plunging into the field, so that you can design your training plan around that goal.
Take for example someone that is interested in working on the latest and greatest for AlphaFold. One can imagine planning their training around schools and programs that have a strong Biophysics component, or leading researchers that merge Artificial Intelligence and protein simulation. Otherwise, many would just pick a local or convenient program, and as you may or may not know, bioinformatics is not built the same at many institutions. So I really implore the uninitiated to consider for a bit what you want from the field and what type of career you want to curate for yourself.
Final Remarks
Wow, look at us. We really made some progress towards understanding what it means to be a bioinformatician and how to dive deeper in the field. Truly, I wasn’t thinking that I’d type this much on the subject, but it turns out that I am pretty opinionated (who would have thought?!).
Regardless of why you are asking the question, if no one has matched your interest and enthusiasm about learning about the field, then all I have to say is welcome! It’s not entirely straightforward and sometimes makes you want to throw your keyboard - but it’s the field I love. I’m glad you want to explore it, too.
Continuously dividing,
N.L.


The dive into algorithms and the biological aspects of evolution was wholly necessary. I've learned something new from that and the entirety of this post. Also, it's my first time hearing about Carl Woese. I'll check out his work since I'm currently working on clustering 16s rRNA sequences.