We, as a biodiversity informatics community, use quite a few programming
languages. Python, Java, R, Ruby are probably the most popular ones. After
working for the last few years with a language Go I think it would be very
beneficial to add it to the biodiversity informatics toolset.
When I started writing in Go, the language seemed to be a bit clunky, did not
have much of cool shiny language constructs that other modern languages have. I
felt there were no “cute” syntax gold nuggets that are common in languages like
Ruby or Python. However, I think Go is amasing for solving many biodiversity
informatics problems.
I tried about 10 different languages for the last 12 years working on
Encyclopedia Of Life and Global Names projects, and for the last 5 years Go
is the language of choice for the vast majority of projects I make.
Lets look at the key features of the language that influenced my choice.
Feature 1: Amazing code readability
Biodiversity informatics runs mostly in academia. In academia projects
blossom and wither depending on availability of funds. Funds often appear and
dissapear, students and postdocs come and go, and quite often people have to
continue development of projects written by someone else.
Therefore, an ability to understand a code written by another person is crucial
for longevity of academia projects. So many important and interesting
developments become stale or died out only because their code was too hard to
understand by newcomers.
I found that code written in Go is one of the easiest to understand. The
simplicity of syntax was one of major design goals in Go, and most of the common
programming tasks can be solved only in one way. Go designers decided on a
brutally minimalistic approach, as a result Go has very little bloat and in vast
majority of cases has no duplication of features between its syntactic
constructs.
As a result a programming style of novice and experienced programmers does not
differ that much. It is great for supporting projects written by others, and for
learning from Go code itself how to solve common problems.
Feature 2: Go is easy to learn
Minimalistic approach to the language design makes it possible to learn Go with
5-20 times less effort than other languages. Specification of the language is
tiny and just going through an ‘official’ Go tutorial is enough to
become productive after a few hours.
Feature 3: Go is easy to maintain
Developers of the language released Go v1.0 in 2012 and plan to support backward
compatibility for many years to come. Starting with Go v1.11 there is also
support for versioning of community packages as dependencies. As a result
it is possible to write a program or library and use it without any changes
inspite of new versions of Go appearing regularly. If a library or a program
depends on Open Source packages, the specific version of each package can be
set in a go.mod
file.
New versions of Go appear regularly, and to use recently added features a
developer can provide Go version in go.mod
.
Go team is very careful when they consider new features, and such features are
implemented only after much thought and discussion. As a result even massive new
functionalities do not create backward incompatibilities.
Feature 4: Go is fast
Go is much faster than such languages like Python, Ruby, Perl or R. In addition
concurrency and parallel execution of the code is a core concept of Go.
Writing concurrent code in Go is orders of magnitude easier than
in C, and after some practice becomes a second nature of a developer.
The combination of the language speed and parallel execution of the code on
multi-core CPUs allows to make Go programs up to 100 times faster than programs
written in interpreted languages. Go is somewhat slower than C, but Go programs
are often faster than analogous C programs, because of the ease of developing
concurrency and parallelism in Go.
Feature 5: Speed of programming is great
Go is quite expressive, and, in my experience, the speed of developing in Go
is comparable with speed of development in Python or Ruby.
Also, running tests in Go is very fast, even big programs
can be tested without a long wait. It allows to run tests often, or run
them on each save.
Feature 6: Very fast compilation
Go is a compiled language, and requires compilation of its code before
execution. The speed of compiling the code is usually almost instantaneous,
and the language developers spend a lot of effort to keep it this way. The
speed of compiling is only slightly slower than executing Python or Ruby
code.
Feature 7: Convenient executable files
As a rule, Go compiles a program into a self-sufficient single executable file.
Most of Go programs have no external dependencies and, as a result, are very
easy to distribute and install. Downloading one file and running it is all what
is required. Go supports cross-compilation. It means that it can create
executable files for any supported OS on one computer. For example, a computer
running Linux can create executables for MS Windows, Mac OS, and Linux in one
go.
The size of the executable files is really tiny. For example, the size of a
name-finding project written in Ruby (GNRD) is about a gigabyte, while the
analogous code in Go is only 50 megabytes.
Small size of executables and lack of external dependencies make Go fantastic
for publishing projects as Docker images, or deplying such images on Kubernetes.
Feature 8: Go is great for remote APIs and web-applications
Writing an extemely fast web-server in Go is a trivial task. Developing
web-applications is quite easy with provided template methods. Distributing
web-applications is also easy, because all the static files are usually
included into a single binary. Besides a traditional REST approach to APIs
there is a very fast streaming gRPC approach.
Feature 9: Rich ecosystem of Open Source libraries
Inspite of being a relatively young language (v1.0.0 was released in 2012) Go
has a very active community and a large number of libraries for many development
needs. Specific libraries for biodiversity informatics are scarse so far, but
they start to appear. For example Global Names provides libraries for finding,
parsing and verifying of scientific names.
Feature 10: Go code can be used in many other languages via C-bindings
It is possible to compile Go into a C library, and use it via C-binding with
many other languages (C, R, Ruby, Python, Java for example).
It is also possible to incorporate C libraries into Go, however most of
functionality usually provided by C libraries is already implemented in pure Go
and, most of the time, introducing such depencencies is not required. For
example there are fantastic Go drivers for most popular databases.
A lot of tools exist for Go that simplify development. Go creators pioneered an
idea of auto-formatting of a written code. There are fantastic plugins for Go
development in VS code, Vim, Emacs etc. JetBrains releases a standalone Go
development platform. Go plugins incorporate linting, formatting,
debugging, refactoring tools. Most of these tools can also be used as a
stand-alone command line applications. Go has powerful profiling and tracing
tools as well.
Conclusion
I hope this post gave you an idea why Go is good for biodiversity informatics,
and you can try to start solving problems that appear in your work using Go
language. I would suggest to pick a small well-defined task, that requires fast
execution, read the tour of Go tutotiral, install Go and its tools for your
favorite editor and start hacking the code!