3v4l.org

run code in 300+ PHP versions simultaneously
<?php

// Set internal encoding to UTF-8
$t = mb_internal_encoding('UTF-8');
echo "Internal Encoding set? ".($t?'yes':'no')." and it is ".mb_internal_encoding()."\n\n";

// these are text copied from https://www.lipsum.com and is UTF-8
$tr = "Lorem Ipsum, dizgi ve baskı endüstrisinde kullanılan mıgır metinlerdir. Lorem Ipsum, adı bilinmeyen bir matbaacının bir hurufat numune kitabı oluşturmak üzere bir yazı galerisini alarak karıştırdığı 1500'lerden beri endüstri standardı sahte metinler olarak kullanılmıştır. Beşyüz yıl boyunca varlığını sürdürmekle kalmamış, aynı zamanda pek değişmeden elektronik dizgiye de sıçramıştır. 1960'larda Lorem Ipsum pasajları da içeren Letraset yapraklarının yayınlanması ile ve yakın zamanda Aldus PageMaker gibi Lorem Ipsum sürümleri içeren masaüstü yayıncılık yazılımları ile popüler olmuştur.";
$encodingtr = "ISO-8859-9";

$cz = "Lorem Ipsum je demonstrativní výplňový text používaný v tiskařském a knihařském průmyslu. Lorem Ipsum je považováno za standard v této oblasti už od začátku 16. století, kdy dnes neznámý tiskař vzal kusy textu a na jejich základě vytvořil speciální vzorovou knihu. Jeho odkaz nevydržel pouze pět století, on přežil i nástup elektronické sazby v podstatě beze změny. Nejvíce popularizováno bylo Lorem Ipsum v šedesátých letech 20. století, kdy byly vydávány speciální vzorníky s jeho pasážemi a později pak díky počítačovým DTP programům jako Aldus PageMaker.";
$encodingcz = "ISO-8859-2";

$fr = "Le Lorem Ipsum est simplement du faux texte employé dans la composition et la mise en page avant impression. Le Lorem Ipsum est le faux texte standard de l'imprimerie depuis les années 1500, quand un imprimeur anonyme assembla ensemble des morceaux de texte pour réaliser un livre spécimen de polices de texte. Il n'a pas fait que survivre cinq siècles, mais s'est aussi adapté à la bureautique informatique, sans que son contenu n'en soit modifié. Il a été popularisé dans les années 1960 grâce à la vente de feuilles Letraset contenant des passages du Lorem Ipsum, et, plus récemment, par son inclusion dans des applications de mise en page de texte, comme Aldus PageMaker.";
$encodingfr = "ISO-8859-1";

check('turkish', $tr, $encodingtr);
check('czech', $cz, $encodingcz);
check('french', $fr, $encodingfr);

function check($lang, $str, $encoding) {
    echo "------- checking $lang $encoding -----------\n";

    // Will convert the text to desired encoding regardless of internal encoding from UTF-8
    // this is the text that we supposedly received from the client
    $s = mb_convert_encoding($str, $encoding, 'UTF-8');

    // Check if the string is valid UTF-8
    $t = mb_check_encoding($s, 'UTF-8');
    echo "Valid UTF-8? ".($t?'yes':'no')." / ";

    // Check if the string is valid ISO-8859-9
    $t = mb_check_encoding($s, 'ISO-8859-9');
    echo "Valid ISO-8859-9? ".($t?'yes':'no')." / ";

    // Check if the string is valid ISO-8859-2
    $t = mb_check_encoding($s, 'ISO-8859-2');
    echo "Valid ISO-8859-2? ".($t?'yes':'no')." / ";

    // Check if the string is valid ISO-8859-1
    $t = mb_check_encoding($s, 'ISO-8859-1');
    echo "Valid ISO-8859-1? ".($t?'yes':'no')."\n";

    // Convert string between same encodings while internal encoding is UTF-8 - should not do anything
    $e = mb_convert_encoding($s, $encoding, $encoding);

    // Convert to UTF-8 from given encoding
    $e2 = mb_convert_encoding($e, "UTF-8", $encoding);
    //echo "Output it to UTF-8 so we can see here: $e2\n";

    // How can I get to here without knowing that the original text is in - I need to detect it
    // not a good method but I need to give a list of encodings based on my customer preferences
    // if I am in Japan need a different set probably
    $t = mb_detect_order('ASCII,UTF-8,ISO-8859-9,ISO-8859-2,ISO-8859-1,ISO-8859-15');
    $detected_encoding1 = mb_detect_encoding($s);
    $e3 = mb_convert_encoding($e, "UTF-8", $detected_encoding1);
    // echo "Detected encoding is: ".$detected_encoding1."\n";

    $detected_encoding2 = mb_detect_encoding($s, 'ASCII,UTF-8,ISO-8859-9,ISO-8859-2,ISO-8859-1,ISO-8859-15');
    // echo "Detected encoding is: ".$detected_encoding2."\n";

    if ($detected_encoding1 !== $detected_encoding2) {
        echo "Detected encodings differ between mb_detect_order and list given $detected_encoding1 != $detected_encoding2\n";
    }

    if ($e2 === $e3) {
        echo "** MATCH we are good!\n";
    } else {
        echo "** SOMETHING IS BROKEN!\n";
    }

    echo "\n----------------------------------------------------------------\n\n";
}

Output for 7.4.0 - 7.4.33, 8.0.1 - 8.0.30, 8.3.0 - 8.3.25, 8.4.1 - 8.4.12
Internal Encoding set? yes and it is UTF-8

------- checking turkish ISO-8859-9 -----------
Valid UTF-8? no / Valid ISO-8859-9? yes / Valid ISO-8859-2? yes / Valid ISO-8859-1? yes
** MATCH we are good!

----------------------------------------------------------------

------- checking czech ISO-8859-2 -----------
Valid UTF-8? no / Valid ISO-8859-9? yes / Valid ISO-8859-2? yes / Valid ISO-8859-1? yes
** SOMETHING IS BROKEN!

----------------------------------------------------------------

------- checking french ISO-8859-1 -----------
Valid UTF-8? no / Valid ISO-8859-9? yes / Valid ISO-8859-2? yes / Valid ISO-8859-1? yes
** MATCH we are good!

----------------------------------------------------------------

Output for 8.1.8 - 8.1.33, 8.2.0 - 8.2.29
Internal Encoding set? yes and it is UTF-8

------- checking turkish ISO-8859-9 -----------
Valid UTF-8? no / Valid ISO-8859-9? yes / Valid ISO-8859-2? yes / Valid ISO-8859-1? yes
** SOMETHING IS BROKEN!

----------------------------------------------------------------

------- checking czech ISO-8859-2 -----------
Valid UTF-8? no / Valid ISO-8859-9? yes / Valid ISO-8859-2? yes / Valid ISO-8859-1? yes
** MATCH we are good!

----------------------------------------------------------------

------- checking french ISO-8859-1 -----------
Valid UTF-8? no / Valid ISO-8859-9? yes / Valid ISO-8859-2? yes / Valid ISO-8859-1? yes
** MATCH we are good!

----------------------------------------------------------------

Output for 8.1.0 - 8.1.7
Internal Encoding set? yes and it is UTF-8

------- checking turkish ISO-8859-9 -----------
Valid UTF-8? no / Valid ISO-8859-9? yes / Valid ISO-8859-2? yes / Valid ISO-8859-1? yes
** SOMETHING IS BROKEN!

----------------------------------------------------------------

------- checking czech ISO-8859-2 -----------
Valid UTF-8? no / Valid ISO-8859-9? yes / Valid ISO-8859-2? yes / Valid ISO-8859-1? yes
** SOMETHING IS BROKEN!

----------------------------------------------------------------

------- checking french ISO-8859-1 -----------
Valid UTF-8? no / Valid ISO-8859-9? yes / Valid ISO-8859-2? yes / Valid ISO-8859-1? yes
** MATCH we are good!

----------------------------------------------------------------
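
In every version above, each sample passes mb_check_encoding for all three ISO-8859 candidates, so validity alone cannot tell them apart and detection has to guess. A minimal sketch of why a wrong guess matters, assuming the mbstring extension is loaded: the same raw byte decodes to a different character under each candidate. The helper name decode_candidates is hypothetical and not part of the script above.

```php
<?php
// Hypothetical helper (not in the original script): decode the same raw bytes
// under each candidate encoding and return the resulting UTF-8 strings.
function decode_candidates(string $bytes, array $encodings): array
{
    $out = [];
    foreach ($encodings as $enc) {
        // mb_convert_encoding(string, to_encoding, from_encoding)
        $out[$enc] = mb_convert_encoding($bytes, 'UTF-8', $enc);
    }
    return $out;
}

// The Turkish word "dağ" as ISO-8859-9 bytes: 0xF0 is "ğ" in that encoding,
// but the same byte maps to other letters in ISO-8859-2 and ISO-8859-1.
$bytes = "da\xF0";
$candidates = ['ISO-8859-9', 'ISO-8859-2', 'ISO-8859-1'];

foreach (decode_candidates($bytes, $candidates) as $enc => $text) {
    echo $enc, ' => ', $text, "\n";
}
```

Because all three decodings are well-formed, a detector can only rank them by heuristics, which is one plausible reason the verdicts above differ between PHP versions.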
