Randomized Algorithm for motif Detection Lusheng Wang Dept of Computer Science City University of Hong Kong Web:http://www.cs.cityu.edu.hk/wlwang e-mail: Iwang @cs. cityu. edu.hk 2021/1/29
2021/1/29 1 Randomized Algorithm for Motif Detection Lusheng Wang Dept. of Computer Science City University of Hong Kong Web: http://www.cs.cityu.edu.hk/~lwang e-mail: lwang@cs.cityu.edu.hk
Outline Review our theoretical results 2. A software for motif detection 3. Other problems we have been working Parametric tools RNA secondary structure comparison Computing translocation distance between genomes 2021/1/29
2021/1/29 2 Outline: 1. Review our theoretical results 2. A software for motif detection 3. Other problems we have been working --Parametric tools RNA secondary structure comparison --Computing translocation distance between genomes
Distinguishing Substring Selection Problem Instance: Given a set B of n, (bad) strings of length at least L, a setG of n,(good) strings, two thresholds db and < Objective: Find a string s such that for each string b E B there exists a substring t of b with lengthl such that d(S,1)≤db and for any string g∈G, (S,g)≥ Bad sequences s: motif Good sequences 2021/1/29
2021/1/29 3 Distinguishing Substring Selection Problem Instance: Given a set B of n1 (bad) strings of length at least L, a set G of n2 (good) strings, two thresholds db and dg ( db < dg ). Objective: Find a string s such that for each string b B there exists a substring t of b with length L such that d(s, t) db and for any string g G, d(s, g) dg . Bad sequences Good sequences s: motif
Applications 1. Finding targets for Potential drugs (T Jiang, C. Trendall, S, Wang, T. Wareham, X. Zhang, 98) (K. Lanctot, M. Li, B Ma, S. Wang, and L. Zhang 1999) bad strings in b are from bacteria --good strings are from Humans find a substring s of length L that is conserved in all bad strings, but not conserved in good strings use s to screen chemicals those selected chemicals can then be tested as potential broad-range antibiotics 2021/1/29
2021/1/29 4 Applications 1. Finding Targets for Potential Drugs (T. Jiang, C. Trendall, S, Wang, T. Wareham, X. Zhang, 98) (K. Lanctot, M. Li, B. Ma, S. Wang, and L. Zhang 1999) -- bad strings in B are from Bacteria. --good strings are from Humans -- find a substring s of length L that is conserved in all bad strings, but not conserved in good strings. -- use s to screen chemicals -- those selected chemicals can then be tested as potential broad-range antibiotics
Applications 2. Creating Diagnostic Probes for Bacterial Infection (T. Brown, G.A. Leonard, E.D. Booth, G. Kneale, 1990) a group of closely related pathogenic bacteria find a substring that occurs in each of the bacterial sequences(with as few substitutions as possible) and does not occurs in the host(Human) 2021/1/29
2021/1/29 5 Applications 2. Creating Diagnostic Probes for Bacterial Infection (T. Brown, G.A. Leonard, E.D. Booth, G. Kneale, 1990) -- a group of closely related pathogenic bacteria -- find a substring that occurs in each of the bacterial sequences (with as few substitutions as possible) and does not occurs in the host (Human)