Category: Bioinformatics

Fastq to Fasta

Inspired by solutions posted at biostar, here is my contribution to this simple sequencing file format conversion.

## fastq_to_fa.py
import sys
from Bio.SeqIO.QualityIO import FastqGeneralIterator
# Stream FASTQ records from stdin and write each one out as a FASTA record
for title, seq, qual in FastqGeneralIterator(sys.stdin):
    print(">%s\n%s" % (title, seq))

Usage:

 cat reads.fastq | python fastq_to_fa.py > reads.fa
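
If Biopython is not installed, the same conversion can be approximated with the standard library alone. The following is only a minimal sketch, assuming strict four-line FASTQ records (no wrapped sequence or quality lines); the function names `fastq_records` and `fastq_to_fasta` are made up for illustration and are not part of Biopython.

```python
import io
import sys


def fastq_records(handle):
    """Yield (title, sequence, quality) tuples from a FASTQ stream.

    Assumption: every record spans exactly four lines.
    """
    while True:
        header = handle.readline()
        if not header:
            return  # end of input
        if not header.strip():
            continue  # tolerate blank lines between or after records
        if not header.startswith("@"):
            raise ValueError("expected '@' at start of record, got %r" % header)
        seq = handle.readline().rstrip("\r\n")
        handle.readline()  # '+' separator line, ignored
        qual = handle.readline().rstrip("\r\n")
        yield header[1:].rstrip("\r\n"), seq, qual


def fastq_to_fasta(in_handle, out_handle):
    """Write each FASTQ record as a two-line FASTA record; return the count."""
    count = 0
    for title, seq, _qual in fastq_records(in_handle):
        out_handle.write(">%s\n%s\n" % (title, seq))
        count += 1
    return count


# Small in-memory demonstration; pass sys.stdin instead to use it in a pipe.
demo = io.StringIO("@read1\nACGT\n+\nIIII\n")
fastq_to_fasta(demo, sys.stdout)
```

For real data the Biopython iterator above is the safer choice, since this sketch does no validation beyond checking the leading `@`.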

If you only need to extract the sequences themselves, that is just as simple!

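## flattenFastq.py
import sys
from Bio.SeqIO.QualityIO import FastqGeneralIterator
# Print only the sequence line of each FASTQ record read from stdin
for title, seq, qual in FastqGeneralIterator(sys.stdin):
    print(seq)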

Circular Ideogram

Displaying data in a circular layout is becoming increasingly popular, and it is a great way to make art out of data!

Here I show one of the many ways to make a Circos-style ideogram plot of point genomic data with a few genomic features. The goal is to create a cheesy visualization of the association of HIV or MLV integration sites with genes and GC content. The data and code are available in my GitHub repository.
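
To give a rough idea of what a GC content track around an integration site involves, here is a small sketch. It is not the code from the repository; the function names, the `flank` parameter, and the choice of a fixed window centered on a 0-based site position are all assumptions made for illustration.

```python
def gc_fraction(seq):
    """Return the fraction of G/C among called bases (A, C, G, T)."""
    seq = seq.upper()
    called = sum(seq.count(base) for base in "ACGT")
    if called == 0:
        return 0.0  # nothing but N or other ambiguity codes
    return (seq.count("G") + seq.count("C")) / called


def gc_around_site(chrom_seq, site, flank=50):
    """GC fraction in a window of `flank` bases on each side of `site`.

    `site` is a 0-based position; the window is clipped at the sequence ends.
    """
    start = max(0, site - flank)
    end = min(len(chrom_seq), site + flank + 1)
    return gc_fraction(chrom_seq[start:end])


# Toy example: a window of one base on each side of position 4
print(gc_around_site("AAAAGCAAAA", 4, flank=1))
```

Values like these, computed per site, are the kind of point data that can be placed on a track of the ideogram.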

[Image: Circos plot]

Bioinformatics Art

During my undergrad, I did research in Dr. James Pierce’s lab working on the horseshoe crab genome. Below are the logos I created for a website which was to host the cDNA library from BACs.


Shortly after I started grad school, hopes of making progress were dim. To boost our spirits and aim for the finish line, I created team Bioinformatics t-shirts.


After wheedling the science gods, we all passed grad school and I landed a job at UPENN in Rick Bushman’s lab. The great majority of my work involved looking at next-gen data produced by the 454 sequencer.

[Image: 454 animation]

The lab's secure website at the time lacked a bit of color and creativity, so I created a few things to set the mood for analysis.

[Image: Bushman lab web front end]