Homework 10: Maximal Common Subsequence

This problem was sent to me by Luther W., one of our members. I have paraphrased a little to clarify the assignment.

Given two sequences of strings, find the longest common subsequence where the strings of the subsequence occur in both given sequences in the same order.

Example:

s1 = A Q 1 9 3 7 8  J 4 6 5 2 K 10
s2 = A K 7 5 J 1 9 10 4 2 8 6 Q  3

The maximal common subsequence is A 1 9 4 6. This subsequence occurs in order in both given sequences.

The given sequences do not have to be the same length, and they can have repeated strings. The solution may not be unique if two or more common subsequences have the same length; in that case, providing only one of them is sufficient.

This is the most worked-on problem in DNA analysis.

You may do this using the language of your choice. I encourage you to solve it in a language you would like to learn, rather than one you are an expert in, to broaden your horizon.

Try to avoid brute-force solutions where possible.

Bring your solution to our August meeting.

Feel free to discuss the problem or any other topics you are interested in on our mailing list (accessible via our Meetup Group).