Can you use TensorFlow and deep learning in Biology?

Deep learning is a family of machine learning methods in which a computer learns patterns from data instead of following hand-written rules. Libraries like TensorFlow can be used to build networks of artificial neurons. Those neurons are organized into layers, and data is processed as it passes from one layer to the next. One of the most common uses of such techniques is image recognition.

In the most common setup, a machine is given labeled data and learns from it. This could be pictures of cats labeled “cat” and pictures of dogs labeled “dog”. The network takes the pixels of each picture as input, and its neurons combine them to guess whether the picture shows a dog or a cat. It then checks the guess against the label. If the guess is wrong, the network adjusts its internal connections slightly and tries again, repeating this until it finds a good way to recognize cats and dogs. In general, the more images you provide, the better the network will become.
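To make the "guess, check, adjust" loop concrete, here is a minimal sketch in plain Python with a single artificial neuron. It is not how TensorFlow is written, and a real image classifier has many layers and thousands of inputs. The two input features and all the numbers are made up for illustration.

```python
import math

# Toy labeled data: (features, label). The two features are hypothetical
# measurements standing in for pixels; 0 means "cat", 1 means "dog".
DATA = [
    ((0.9, 0.2), 0), ((0.8, 0.3), 0), ((0.7, 0.1), 0),
    ((0.2, 0.8), 1), ((0.3, 0.9), 1), ((0.1, 0.7), 1),
]

def predict(weights, bias, features):
    """Return the neuron's estimated probability that the label is 1."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

def train(data, epochs=500, learning_rate=0.5):
    """Repeat: guess, compare with the label, nudge the weights."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for features, label in data:
            # Positive error means the guess was too high, negative too low.
            error = predict(weights, bias, features) - label
            weights = [w - learning_rate * error * x
                       for w, x in zip(weights, features)]
            bias -= learning_rate * error
    return weights, bias

weights, bias = train(DATA)
correct = sum((predict(weights, bias, f) > 0.5) == bool(label)
              for f, label in DATA)
accuracy = correct / len(DATA)
```

After training, `predict` can be called on a new pair of features to get the neuron's guess for an example it has not seen.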

Another way such a technology may be used is to find patterns within unlabeled data. Here you do not tell the network what the right answer is. You provide large data sets to a neural network and let it sort everything. It will attempt to find patterns and will group the data according to what it sees. With a small dataset, a human who knows what to look for may do this better. But when you have millions of variables, a machine can be much faster and can accelerate a human's workflow.
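The idea of grouping unlabeled data can be sketched with k-means clustering. Note that k-means is a classic clustering algorithm, not a neural network. I use it here only because it shows the same principle in a few lines. The measurements are invented, and the starting centroids are picked in a fixed way so the result is repeatable (real implementations usually start randomly).

```python
def dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    """Coordinate-wise average of a non-empty list of points."""
    return tuple(sum(axis) / len(points) for axis in zip(*points))

def kmeans(points, k, iterations=10):
    """Group points into k clusters; returns (labels, centroids)."""
    # Deterministic start: take evenly spaced points as initial centroids.
    step = max(1, len(points) // k)
    centroids = [points[i * step] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist2(p, centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the middle of its cluster (keep it if empty).
        centroids = [mean(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    labels = [min(range(k), key=lambda i: dist2(p, centroids[i]))
              for p in points]
    return labels, centroids

# Hypothetical unlabeled measurements from two kinds of samples, mixed up.
points = [(1.0, 1.1), (5.0, 5.2), (0.9, 1.0), (5.1, 4.9), (1.2, 0.8), (4.8, 5.0)]
labels, centroids = kmeans(points, k=2)
```

Nobody told the algorithm which sample belongs where. It only sees the numbers and ends up putting the small values in one group and the large values in the other.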

This is why so many companies invest in deep learning: as experimental data gets easier to produce, analyzing it becomes the challenge. The faster someone manages to analyze data and produce research and patents, the faster profit and investments will come.

But as someone working in the field of biology, how can you utilize this powerful tool?


The easiest answer to this question is: analyze image data. If you work with bacteria or plants and you can take pictures of them, with a drone or manually, then you may be able to use deep learning to find patterns within your samples. If the dataset is small, though, it may be more efficient to do it manually. Also keep in mind that you need an initial dataset to train the network. This technology is a solution for large datasets, not a replacement for manual data analysis.

Another way to use such a technology is in bioinformatics. If you have access to sequencers and multiple assays for thousands of individuals, then this may be perfect for you. Even 1,000 to 2,000 bases of DNA sequenced from 1,000 people (or animals) might take you forever to analyze by hand. And conventional algorithms might miss new, undocumented patterns or SNPs. With a deep learning neural network you may find, in just a few days depending on your processing power, patterns that would take a human weeks or even months.
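A neural network cannot read the letters A, C, G and T directly, so sequences are usually turned into numbers first. A common way to do that is one-hot encoding. The sketch below shows that step, plus a very simple scan for positions where aligned sequences disagree (candidate SNPs). The sample sequences and function names are made up, and the scan is a plain comparison, not deep learning.

```python
BASES = "ACGT"

def one_hot(sequence):
    """Encode a DNA string as a list of 4-element rows, one row per base."""
    rows = []
    for base in sequence.upper():
        # Assumption: unknown symbols such as N become an all-zero row.
        rows.append([1 if base == b else 0 for b in BASES])
    return rows

def variable_positions(sequences):
    """Return 0-based positions where already-aligned sequences differ."""
    return [i for i, column in enumerate(zip(*sequences))
            if len(set(column)) > 1]

# Hypothetical aligned reads of the same region from four individuals.
samples = ["ACGTAC", "ACGTAC", "ACCTAC", "ACGTAT"]
encoded = [one_hot(s) for s in samples]      # numeric input for a network
snp_candidates = variable_positions(samples)
```

The `encoded` lists are the kind of numeric input a network would be trained on, while `snp_candidates` gives a quick baseline to compare against.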

What about that processing power though? Do I need a supercomputer?

I guess you might want one… but unless you are doing something crazy with incredibly large data sets (in which case you wouldn’t be reading this article), not really. With a computer that costs 2-4 thousand pounds you can solve such problems in a few days. The focus is on the GPU, which is usually much faster at these tasks than a normal processor (CPU).
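If you are curious whether your own machine has a GPU that TensorFlow can use, a short check like the one below should tell you. It assumes TensorFlow 2.x is installed and simply returns an empty list if it is not. The helper name `available_gpus` is my own.

```python
def available_gpus():
    """Return the names of GPUs visible to TensorFlow (empty list if none)."""
    try:
        import tensorflow as tf
    except ImportError:
        # TensorFlow is not installed, so report no usable GPUs.
        return []
    return [device.name for device in tf.config.list_physical_devices("GPU")]

gpus = available_gpus()
print("GPUs found:", gpus if gpus else "none, training will run on the CPU")
```

An empty list does not mean you cannot train a network. It only means training will run on the CPU and will probably be slower.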

If you are already working with such tech, you don’t need all that info. But since you clicked on this article, you probably study biology and wonder if you will ever need to know about this, or you work in a lab and would like to accelerate your workflow. In any case, I hope I gave you some basic information. I would recommend learning more about this tech from the sources below, since it may be useful to you someday. For now it is complicated and requires programming knowledge, but I am sure it will soon become easier to use.

