Using makefiles in bioinformatics pipelines

Recently I have been involved in many projects where parsing text files is necessary, and I use small scripts to achieve the task. However, the code should support pipelines, and I have found makefiles very handy for that.

Makefiles are widely used in software development to define a set of rules for building programs automatically, and they have been in use since the end of the 1970s. The command called make reads a makefile (named makefile or Makefile), executes the commands within it and produces the desired build. Moreover, make can also invoke scripts, which makes it a very handy utility in bioinformatics.

A makefile consists of multiple rules, and each rule defines which components are needed to create a target and which commands create it. A rule can be compared to a recipe in a cookbook: the components are the ingredients, and the commands are the cooking steps that “compile” the ingredients into the food on the table.

fried_chicken: raw_chicken oil

Formally speaking:

target1 [target2 ...]: [component1 component2 ...]
	[<TAB>command 1]
	[<TAB>command 2]

The targets are defined on the left side, and the components they need are stated on the right side, just after the colon. Components are optional: a rule that creates a file by just “touching” it needs none. The commands that create the targets are listed afterwards, each on its own line starting with a tab. make executes the commands with Unix’s default shell, /bin/sh, so commands such as cat, cp and rm can be invoked.
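As a minimal sketch of such a rule, assume a FASTA file called sequences.fasta already exists and we want its header lines in headers.txt. Both file names are made up for illustration.

```makefile
# Hypothetical file names: sequences.fasta is an existing input,
# headers.txt is the target this rule creates.
headers.txt: sequences.fasta
	grep '^>' sequences.fasta > headers.txt
```

Running make headers.txt executes the grep command only if headers.txt is missing or older than sequences.fasta. The command line must start with a real tab character, not spaces.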

Another nice feature of make is that a target can itself be a component of another rule. For example:

dinner: fried_chicken baked_potatoes

fried_chicken: ...

baked_potatoes: ...

Here, if we issue make dinner, make will check whether fried_chicken and baked_potatoes exist and are up to date; if not, it will run their rules first.
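The dinner example can be turned into a runnable sketch. It assumes that the files raw_chicken, oil and raw_potatoes exist in the current directory; the cat and cp commands are only stand-ins for real processing steps.

```makefile
# The first rule in the file is the default goal.
dinner: fried_chicken baked_potatoes
	cat fried_chicken baked_potatoes > dinner

# Intermediate target: also a component of dinner.
fried_chicken: raw_chicken oil
	cat raw_chicken oil > fried_chicken

# Intermediate target: also a component of dinner.
baked_potatoes: raw_potatoes
	cp raw_potatoes baked_potatoes
```

With this file, make dinner first creates fried_chicken and baked_potatoes if needed, and only then runs the command for dinner.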

There are two common targets: all and clean. Programmers define the all target to create every other target, while clean is responsible for launching an rm command to clean up the build environment.
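A possible layout for these two targets is sketched below. The file names are hypothetical, and the .PHONY line is an addition of mine: in GNU make it declares that all and clean are not files, so a stray file named clean cannot block the rule.

```makefile
# all and clean are names of actions, not files (GNU make).
.PHONY: all clean

# Build every output of this (hypothetical) pipeline.
all: headers.txt counts.txt

headers.txt: sequences.fasta
	grep '^>' sequences.fasta > headers.txt

counts.txt: headers.txt
	wc -l < headers.txt > counts.txt

# Remove everything the pipeline generated, never the input.
clean:
	rm -f headers.txt counts.txt
```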

There is another advantage of makefiles. Let us assume that we have an all target and one of the components has been updated (e.g. a newer source file has been downloaded from the internet, so it has a newer timestamp). After we issue make all again, make will discover that the component is newer and rebuild every target that lists that component, directly or indirectly. This feature makes it possible to build pipelines.
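The timestamp check can be illustrated with a small two-sample pipeline. This is only a sketch: the sample file names are invented, and it assumes each FASTA file contains at least one header line, because grep -c returns a failure status when the count is zero.

```makefile
# Final report depends on one count file per sample.
report.txt: sample_a.counts sample_b.counts
	cat sample_a.counts sample_b.counts > report.txt

# Count the sequences (header lines) in each sample.
sample_a.counts: sample_a.fasta
	grep -c '^>' sample_a.fasta > sample_a.counts

sample_b.counts: sample_b.fasta
	grep -c '^>' sample_b.fasta > sample_b.counts
```

If only sample_a.fasta is replaced by a newer file, the next make report.txt rebuilds sample_a.counts and then report.txt, while sample_b.counts is left untouched.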

Before executing make, we may be interested in what will be done:

make -n target1 target2 ...

Calling make with -n shows which commands would be issued in a real execution, without running them.

I hope this article gave a brief introduction to make. Here are a few links about make that I found useful:

Advanced Makefile Tricks – describes how to use special macros. These are very useful for, e.g., passing components as arguments to the commands or piping output to the target.

Make (software) – the Wikipedia entry about make, which describes its history and shows some examples.
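To give a taste of the special macros mentioned above, here is a small sketch that I assume matches what the first link covers: GNU make's automatic variables. The file names are hypothetical, and both the pattern rule and $^ are GNU make features.

```makefile
# $@ is the target, $< the first component, $^ all components.
report.txt: sample_a.counts sample_b.counts
	cat $^ > $@

# Pattern rule: build any X.counts from the matching X.fasta.
%.counts: %.fasta
	grep -c '^>' $< > $@
```

The pattern rule replaces one hand-written rule per sample, so adding a new sample only means adding its .counts file to the component list of report.txt.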

Price of cheap solutions: using home wireless routers in a business environment

Recently I spent some days in different hotels in Hungary, and I was quite surprised that in many places routers designed for home use had been set up to serve large numbers of users. The favourite manufacturer was TP-LINK, and I was shocked by the solutions I saw:
1. The routers were connected in a daisy chain with Ethernet cables instead of a star topology.
2. Many amateur, so-called network specialists do not know that the same SSID can be assigned to multiple routers. In order to minimize channel conflicts, they think that giving each router a different SSID solves the problem, instead of cleverly assigning the radio channels. This is very inconvenient, because the user has to connect to every router separately.
3. TP-LINK is perfect for home and small business use, however I would think twice before deploying it for hundreds of users.

I am sure that such a solution is very cheap to set up initially, however the network tends to break down, and the working hours of a network specialist are expensive. It seems that many CEOs simply do not plan for a long-term solution; they provide some kind of connection without taking into account that router errors lead to frustrated guests.

Interesting fact – Euro denominated bank deposit interest rate is higher in Hungary

Putting my savings into a bank deposit in Germany is hardly worthwhile, since the interest rate is around 1%. Exchanging them into Hungarian forints does not make sense either, because the exchange rate is unreliable and the forint has become roughly 20% weaker against the euro than it was one year ago.

However, I discovered that most Hungarian banks offer euro-based accounts with an interest rate of about 5-6%. And the costs? Roughly 1-2 euros per month per account and a 16% tax on the interest. I think it is worth keeping my savings in my home country.
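As a rough sanity check, the net yearly return can be estimated from the figures above. The deposit of 10,000 euros is a made-up example, P is the deposit, r the interest rate and f the monthly account fee; the sketch ignores compounding and any other fees.

```latex
\text{net yearly return} = P \cdot r \cdot (1 - 0.16) - 12 f

% Pessimistic case: r = 5\%, f = 2 euros per month
10000 \cdot 0.05 \cdot 0.84 - 12 \cdot 2 = 420 - 24 = 396

% Optimistic case: r = 6\%, f = 1 euro per month
10000 \cdot 0.06 \cdot 0.84 - 12 \cdot 1 = 504 - 12 = 492
```

Under these assumptions the deposit earns roughly 4-5% net, compared with about 100 euros (1%) before any tax or fees in the German case.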

Don’t use client side scripting for Googlebot

Recently I have been involved in creating the VIZBI website. The VIZBI initiative is an international conference series bringing together researchers who develop and use computational visualization to address a broad range of biological research areas.

We have been using jQuery extensively on the website, however there was one drawback: compatibility with Googlebot. Unfortunately it cannot interpret JavaScript, therefore data retrieved on the client side could not be parsed. We have redesigned the video page so that the text and data are rendered by PHP, and now the page can be parsed by Google.

Meanwhile, we still use jQuery and plugins built upon it to arrange and visualize the elements on the webpage in a nice way. The following links helped us to get Googlebot to crawl our website.

Vimeo SEO: How To Get Embedded Vimeo Videos Into

Tricky situation: video sitemaps for external videos + JS overlay

Presentation tools and tips

Since giving presentations is an essential skill, I would like to share some technical ideas that can make one more successful on the stage. These features exist in the common presentation programs, e.g. PowerPoint, Keynote etc.

  1. Aim for 1024×768 resolution
    Modern displays are capable of higher resolutions, however most projectors are limited to 1024×768. This means that the figures on the slides can end up pixelated and too small. The resolution can be changed, and Presenter View turned on, in the settings of the presentation program.

  2. Use Presenter View (for reading the notes and checking the time)
    Nothing is more annoying during a presentation than a speaker who talks for 50 minutes instead of the scheduled 30. Moreover, it is a common error that speakers use the slides as notes for themselves and read them aloud. Presenter View helps to solve both issues by showing the notes, the elapsed time and the next slides on the speaker's monitor.
    In order to access this functionality, mirroring has to be turned off in the display settings, and the extended display option has to be chosen. More info here: Windows 7, Mac OS X

  3. Zooming on text
    I often attend coding and hacking sessions, where the size of the text is important. Fortunately, modern operating systems include a built-in tool for zooming. E.g. Windows XP/7 has the Magnifier tool by default, while Mac OS X has key combinations for launching its zooming functionality.
    However, I recommend a tool called ZoomIt to Windows users. In addition to the default zooming capabilities, it allows the presenter to write anywhere on the screen or to draw on the presentation. Very handy!

  4. Give yourself 15 minutes to set up the laptop and the display
    Compatibility between different pieces of technical equipment is not always perfect. It is better to discover technical issues early than to solve them in front of the audience.