# A Biclique Approach to Reference Anchored Gene Blocks and Its Applications to Pathogenicity Islands Conference Paper

•
• Overview
•
• Research
•
•
• View All
•

### abstract

• We formalize a new problem variant in gene-block discovery, denoted Reference-Anchored Gene Blocks (RAGB). Given a query sequence Q of length n, representing the gene-array of a DNA element, a window size bound d on the length of a substring of interest in Q, and a set of target gene sequences $$\mathcal {T}=\{T_1 \ldots T_c\}$$. Our objective is to identify gene-blocks in $$\mathcal {T}$$ that are centered in a subset q of co-localized genes from Q, and contain genomes from $$\mathcal {T}$$ in which the corresponding orthologs of the genes from q are also co-localized. We cast RAGB as a variant of a (colored) biclique problem in bipartite graphs, and analyze its parameterized complexity, as well as the parameterized complexity of other related problems. We give an $$O(nm+2^dnm/\lg m)$$ time algorithm for the uncolored variant of our biclique problem, where m is the number of areas of interest that are parsed from the target sequences, and n and d are as defined above. Our algorithm can be adapted to compute all maximal bicliques in the graph within the same time complexity, and to handle edge-weights with a slight $$O(\lg d)$$ increase to its time complexity. For the colored version of the problem, our algorithm has a time complexity of $$O(2^dnm)$$. We implement the algorithm and exemplify its application to LEE, a well-known pathogenicity island from the e. coli genome harboring virulence genes. Our code and supplementary materials, including omitted proofs and figures, are available at https:// www. cs. bgu. ac. il/ ~negevcb/ RAGB/ .

### publication date

• January 1, 2016