php - Full Text Search - Ignore Polish letters in search phrase
Is it possible to ignore Polish characters in a search phrase when using full text search? For example, I have a movie in the database. The movie's title is "pięćdziesiąt twarzy greya". If a visitor searches for the phrase "piecdziesiat", the script should find the movie whose title contains "pięćdziesiąt" (ignoring the Polish letters).
Is this possible?
You can use strtr() to convert a string with diacritics into a string without diacritics. For example, you can convert 'pięćdziesiąt' to 'piecdziesiat'. There's a comment on the PHP documentation page with a useful function containing a translation table.
For posterity's sake, here it is:
function normalize($string)
{
    // Maps accented letters (both cases) to lowercase ASCII equivalents.
    $table = array(
        'Š'=>'s', 'š'=>'s', 'Đ'=>'dj', 'đ'=>'dj', 'Ž'=>'z', 'ž'=>'z', 'Č'=>'c', 'č'=>'c', 'Ć'=>'c', 'ć'=>'c',
        'À'=>'a', 'Á'=>'a', 'Â'=>'a', 'Ã'=>'a', 'Ä'=>'a', 'Å'=>'a', 'Æ'=>'a', 'Ç'=>'c',
        'È'=>'e', 'É'=>'e', 'Ê'=>'e', 'Ë'=>'e', 'Ì'=>'i', 'Í'=>'i', 'Î'=>'i', 'Ï'=>'i',
        'Ñ'=>'n', 'Ò'=>'o', 'Ó'=>'o', 'Ô'=>'o', 'Õ'=>'o', 'Ö'=>'o', 'Ø'=>'o',
        'Ù'=>'u', 'Ú'=>'u', 'Û'=>'u', 'Ü'=>'u', 'Ý'=>'y', 'Þ'=>'b', 'ß'=>'ss',
        'à'=>'a', 'á'=>'a', 'â'=>'a', 'ã'=>'a', 'ä'=>'a', 'å'=>'a', 'æ'=>'a', 'ç'=>'c',
        'è'=>'e', 'é'=>'e', 'ê'=>'e', 'ë'=>'e', 'ì'=>'i', 'í'=>'i', 'î'=>'i', 'ï'=>'i',
        'ð'=>'o', 'ñ'=>'n', 'ò'=>'o', 'ó'=>'o', 'ô'=>'o', 'õ'=>'o', 'ö'=>'o', 'ø'=>'o',
        'ù'=>'u', 'ú'=>'u', 'û'=>'u', 'ü'=>'u', 'ý'=>'y', 'þ'=>'b', 'ÿ'=>'y', 'Ŕ'=>'r', 'ŕ'=>'r',
        // Polish letters not covered by the rows above
        'Ą'=>'a', 'ą'=>'a', 'Ę'=>'e', 'ę'=>'e', 'Ł'=>'l', 'ł'=>'l', 'Ń'=>'n', 'ń'=>'n',
        'Ś'=>'s', 'ś'=>'s', 'Ź'=>'z', 'ź'=>'z', 'Ż'=>'z', 'ż'=>'z',
    );
    return strtr($string, $table);
}
So if the user searches for "pięćdziesiąt", you turn it into "piecdziesiat" (you can run it through strtolower() as well). In the database, you have a field for the 'canonicalised version' of the title, which has had its diacritics stripped in the same way. When you search in the database, search on the canonical field instead of the title field.
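The canonical-field idea above can be sketched roughly as follows. This is only an illustration under stated assumptions: canonicalise() and searchMovies() are hypothetical names, the table is cut down to the Polish letters, and a plain PHP array with a title_canonical key stands in for the database table and its canonical column.

```php
<?php
// Hypothetical helper: a reduced translation table covering only Polish letters.
function canonicalise(string $text): string
{
    $table = [
        'Ą' => 'a', 'ą' => 'a', 'Ć' => 'c', 'ć' => 'c',
        'Ę' => 'e', 'ę' => 'e', 'Ł' => 'l', 'ł' => 'l',
        'Ń' => 'n', 'ń' => 'n', 'Ó' => 'o', 'ó' => 'o',
        'Ś' => 's', 'ś' => 's', 'Ź' => 'z', 'ź' => 'z',
        'Ż' => 'z', 'ż' => 'z',
    ];
    // strtr() swaps the multibyte keys first; strtolower() then lowercases
    // the ASCII letters that remain.
    return strtolower(strtr($text, $table));
}

// Stand-in for the database: each row keeps the original title plus a
// canonical copy that is computed once, when the row is stored.
$movies = [
    ['title' => 'Pięćdziesiąt twarzy Greya'],
    ['title' => 'Żółta łódź podwodna'],
];
foreach ($movies as &$movie) {
    $movie['title_canonical'] = canonicalise($movie['title']);
}
unset($movie);

// Canonicalise the search phrase the same way and match it against the
// canonical field, never against the original title.
function searchMovies(array $movies, string $phrase): array
{
    $needle = canonicalise($phrase);
    $hits = [];
    foreach ($movies as $movie) {
        if ($needle !== '' && strpos($movie['title_canonical'], $needle) !== false) {
            $hits[] = $movie['title'];
        }
    }
    return $hits;
}

print_r(searchMovies($movies, 'piecdziesiat'));
```

Because both the stored title and the search phrase go through the same function, it does not matter whether the visitor types the phrase with or without diacritics.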
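Another option depends on the database you're using. PostgreSQL has the unaccent extension, which lets you do this on the database side, without the need for a 'canonical' field. In MySQL / MariaDB the result depends on the column's collation: an accent-insensitive collation such as utf8_general_ci ignores most diacritics when comparing, whereas utf8_bin compares bytes and does not. I'm pretty sure MongoDB has a similar function.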
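For the PostgreSQL route, a query could be built along these lines. This is a sketch, not a tested setup: buildUnaccentSearch() is a hypothetical helper, the movies table with id and title columns is assumed for illustration, and the unaccent extension is assumed to be installed (CREATE EXTENSION unaccent). It uses a simple ILIKE match rather than PostgreSQL's full text search machinery.

```php
<?php
// Hypothetical helper: returns a parameterised query that strips accents on
// the database side, so no separate canonical column is needed.
// Assumed schema: movies(id, title); assumed extension: unaccent.
function buildUnaccentSearch(string $phrase): array
{
    $sql = 'SELECT id, title FROM movies '
         . 'WHERE unaccent(title) ILIKE unaccent(:pattern)';
    // Escape LIKE wildcards so the visitor's input is matched literally.
    $pattern = '%' . addcslashes($phrase, '%_\\') . '%';
    return [$sql, [':pattern' => $pattern]];
}

[$sql, $params] = buildUnaccentSearch('piecdziesiat');

// With an existing PDO connection in $pdo (not created here):
// $stmt = $pdo->prepare($sql);
// $stmt->execute($params);
// $rows = $stmt->fetchAll();
```

Applying unaccent() to both sides means the phrase matches whether or not the visitor typed the diacritics.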