Multiple Sequence Alignment (MSA) is perhaps second only to pairwise sequence alignment in overall importance in bioinformatics; it is critical, for example, in determining the structure and function of molecules from putative families of sequences. But while pairwise sequence alignment has been the subject of scores of FPGA acceleration studies, MSA has been the subject of only a few. The most important of these accelerate Clustal-W, the most commonly used MSA code, either by implementing the first of its three phases (which accounts for over 90% of the run time) with Dynamic Programming (DP) methods, or by accelerating the third phase, which consumes most of the remaining time. We take a new approach: we apply prefiltering of the kind commonly used in BLAST to perform the initial all-pairs alignments. This yields a speedup of 80x to 190x over the CPU code (8 cores) and of 2.5x to 8x over DP/FPGA- and GPU-based methods. When combined with a recently published method for phase 3, and using the original software for phase 2, the end-to-end speedup is at least 50x over an 8-core implementation of the original code. Alignment quality is comparable to that of the original, as measured on a commonly used benchmark suite with respect to multiple distance metrics.
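
To make the idea of BLAST-style prefiltering for the all-pairs phase concrete, the following is a minimal software sketch, not the FPGA design described in this work. It assumes exact k-mer seeding followed by ungapped extension with a drop-off threshold; the function names, the word length `k`, the +1/-1 scoring, and the `drop` value are all hypothetical choices made for illustration (a real protein pipeline would typically use a substitution matrix).

```python
from itertools import combinations


def kmer_index(seq, k):
    """Map each k-mer of seq to the list of positions where it starts."""
    index = {}
    for i in range(len(seq) - k + 1):
        index.setdefault(seq[i:i + k], []).append(i)
    return index


def ungapped_extend(a, b, i, j, k, match=1, mismatch=-1, drop=3):
    """Extend an exact k-mer hit a[i:i+k] == b[j:j+k] without gaps.

    Extension in each direction stops once the running score falls
    `drop` below the best score seen so far (illustrative values).
    """
    score = best = k * match
    # Extend to the right of the seed.
    x, y = i + k, j + k
    while x < len(a) and y < len(b) and best - score < drop:
        score += match if a[x] == b[y] else mismatch
        best = max(best, score)
        x, y = x + 1, y + 1
    # Restart from the best right-hand endpoint, then extend to the left.
    score = best
    x, y = i - 1, j - 1
    while x >= 0 and y >= 0 and best - score < drop:
        score += match if a[x] == b[y] else mismatch
        best = max(best, score)
        x, y = x - 1, y - 1
    return best


def prefilter_score(a, b, k=3):
    """Best ungapped-extension score over all shared k-mer seeds (0 if none)."""
    index = kmer_index(b, k)
    best = 0
    for i in range(len(a) - k + 1):
        for j in index.get(a[i:i + k], ()):
            best = max(best, ungapped_extend(a, b, i, j, k))
    return best


def all_pairs_scores(seqs, k=3):
    """Score every unordered pair of sequences, keyed by index pair."""
    return {
        (p, q): prefilter_score(seqs[p], seqs[q], k)
        for p, q in combinations(range(len(seqs)), 2)
    }
```

In this sketch, pairs that share no k-mer are never extended and simply score 0, which is where the savings over full DP on every pair would come from. The resulting scores stand in for the pairwise similarities that the first phase produces.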