This paper presents performance results for parallel algorithms that compute the longest common subsequence of two strings. The underlying algorithm is representative of a class of algorithms that compute string-to-string distances and has computational complexity $O(n^2)$. The parallel algorithm uses a variable grid size, runs in $O(p)$ supersteps (synchronization phases) and has linear communication costs. We study this algorithm in the BSP context, give running-time estimates and compare the predictions to experimental values measured on three different parallel architectures, using different BSP programming libraries and an efficient implementation for the sequential computation. We find that using the BSP model with an appropriately optimized BSP library improves performance over plain MPI, and that scalability can be improved by tuning the grid size parameter.
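For orientation, the quadratic dynamic program behind the longest common subsequence can be sketched as follows. This is a minimal illustration, not the implementation evaluated in the paper. The function names `lcs_length` and `lcs_length_blocked` and the parameter `grid` are hypothetical. The second function assumes a blocked wavefront evaluation order, a common way to expose parallelism in this dynamic program: the table is split into `grid` x `grid` blocks, and blocks on the same anti-diagonal do not depend on each other, so each anti-diagonal could correspond to one synchronization phase. Here the blocks are simply evaluated sequentially.

```python
def lcs_length(a: str, b: str) -> int:
    """LCS length via the standard dynamic program, O(len(a) * len(b)) time."""
    prev = [0] * (len(b) + 1)          # previous row of the DP table
    for ch in a:
        curr = [0]                     # column 0 is always 0
        for j, bj in enumerate(b, start=1):
            if ch == bj:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcs_length_blocked(a: str, b: str, grid: int):
    """Same DP, evaluated block by block along anti-diagonals (assumed scheme).

    Returns (lcs_length, number_of_wavefronts). Blocks within one wavefront
    are independent; this sketch processes them one after another.
    """
    if grid < 1:
        raise ValueError("grid must be at least 1")
    m, n = len(a), len(b)
    table = [[0] * (n + 1) for _ in range(m + 1)]
    # Block boundaries along each string (blocks may be empty if grid is large).
    rows = [m * k // grid for k in range(grid + 1)]
    cols = [n * k // grid for k in range(grid + 1)]
    wavefronts = 0
    for d in range(2 * grid - 1):      # anti-diagonal index of the block grid
        wavefronts += 1
        for bi in range(max(0, d - grid + 1), min(d, grid - 1) + 1):
            bj = d - bi
            for i in range(rows[bi] + 1, rows[bi + 1] + 1):
                for j in range(cols[bj] + 1, cols[bj + 1] + 1):
                    if a[i - 1] == b[j - 1]:
                        table[i][j] = table[i - 1][j - 1] + 1
                    else:
                        table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[m][n], wavefronts
```

In this sketch a `grid` x `grid` partition yields `2 * grid - 1` wavefronts, which shows how the grid size trades the number of synchronization phases against the amount of work available per phase.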