Introduction
Encoding issues cause odd characters when a subtitle file is read with the wrong character set. Subtitle Edit includes powerful tools that diagnose and fix these problems by enforcing universal character standards.
Use UTF-8 correctly to prevent gibberish characters from corrupting your work and to ensure worldwide readability.
What Causes Encoding Problems in Subtitle Files?

Encoding conflicts mainly arise when a subtitle file is transferred between different operating systems or when it is saved using an outdated standard.
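“ANSI” on Windows means the system’s legacy code page, and each code page covers only one group of languages (for example, Windows-1252 covers Western European languages). Using it for text in any other language produces unreadable characters.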
Another cause is mismatched system locales. If you edit a Turkish subtitle file on a Japanese Windows system, the system’s default settings might misinterpret the Turkish characters. This happens because the system’s default character map prioritizes its own local alphabet over the file’s content.
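The mismatch is easy to reproduce outside Subtitle Edit. The following Python sketch is only an illustration: the sample word is made up, and it uses the Western European code page (Windows-1252) as the "wrong" system default because it makes the effect easy to see.

```python
# Turkish text saved by an editor that uses the Turkish code page (Windows-1254).
original = "ışığı"
raw_bytes = original.encode("cp1254")

# A program that assumes the Western European code page (Windows-1252)
# maps the very same bytes to different letters.
misread = raw_bytes.decode("cp1252")

# Decoding with the code page the file was written in recovers the text.
recovered = raw_bytes.decode("cp1254")

print(misread)    # ýþýðý
print(recovered)  # ışığı
```

The bytes on disk never changed. Only the character map used to interpret them did, which is why choosing the right encoding on open fixes the display without touching the file.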
How to Load the Subtitle File with Correct Encoding
The first step in fixing an encoding issue is telling Subtitle Edit exactly which character set to use when opening the file.
When you open a file via drag-and-drop, Subtitle Edit attempts to guess the encoding, and the guess can be wrong, especially for legacy code pages that carry no encoding marker. When that happens, override the automatic guess.
Using the File > Open Menu
Instead of dragging the file into the main window, go to File > Open. Navigate to your subtitle file and select it.
Before you click “Open,” look at the bottom of the open dialog box. You will see a dropdown labeled “Encoding.” This allows you to force a specific character set.
Forced UTF-8 on Open
If the file contains non-English characters and you suspect it was saved as UTF-8, select “Unicode (UTF-8)” from the encoding list. The program then reads the file as UTF-8 instead of guessing.
If the file is highly localized (like a traditional Chinese Big5 file), try selecting that specific encoding. After opening, the gibberish text should resolve to the correct language script instantly.
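Forcing an encoding on open has a direct equivalent in most programming languages. As a rough sketch in Python (the function name `load_subtitle` and the sample file are hypothetical, not part of Subtitle Edit), reading a file with an explicit character set looks like this:

```python
import tempfile
from pathlib import Path


def load_subtitle(path, encoding):
    """Read a subtitle file, decoding its bytes with the given character set."""
    # Decoding is strict by default: bytes that are invalid in the chosen
    # encoding raise UnicodeDecodeError instead of being silently guessed.
    return Path(path).read_text(encoding=encoding)


# Demonstration with a throwaway file holding Traditional Chinese text in Big5.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.srt"
    sample.write_bytes("1\n00:00:01,000 --> 00:00:02,000\n你好\n".encode("big5"))
    text = load_subtitle(sample, "big5")

print(text)
```

Passing the wrong encoding name either raises an error or produces the same kind of gibberish you see in the editor, which is a useful hint that another character set should be tried.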
How to Convert Subtitles to Universal UTF-8 Permanently
After the file displays correctly, immediately convert it to UTF-8. UTF-8 is the de facto standard for subtitles and can represent the characters of every language, so the file reads the same on any system.
Applying the Encoding Change
Go to File > Encoding to see the file’s current encoding. Choose UTF-8 for encoding. The text remains unchanged, but the file becomes universally compatible.
Saving as UTF-8
Go to File > Save As. In the Encoding dropdown, choose “Unicode (UTF-8)” or “Unicode (UTF-8 with BOM).”
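Saving with a BOM embeds a marker that identifies the file as UTF-8. Saving without one still produces byte sequences that most modern players recognize as UTF-8. Either way, media players are far less likely to guess the wrong encoding when opening the file.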
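For readers who want to see what the conversion amounts to, here is a minimal Python sketch. It assumes you already know the source encoding; the helper names `reencode_as_utf8` and `save_as_utf8` are invented for this example and do not describe how Subtitle Edit works internally.

```python
from pathlib import Path


def reencode_as_utf8(raw, source_encoding, with_bom=False):
    """Decode bytes from a known source encoding and re-encode them as UTF-8."""
    text = raw.decode(source_encoding)
    # Python's "utf-8-sig" codec prepends the BOM bytes EF BB BF;
    # plain "utf-8" writes no marker.
    target = "utf-8-sig" if with_bom else "utf-8"
    return text.encode(target)


def save_as_utf8(src, dst, source_encoding, with_bom=False):
    """File-level wrapper: read src, convert, and write the result to dst."""
    converted_bytes = reencode_as_utf8(Path(src).read_bytes(), source_encoding, with_bom)
    Path(dst).write_bytes(converted_bytes)


# A Cyrillic word stored in Windows-1251, converted to UTF-8 with a BOM.
converted = reencode_as_utf8("Привет".encode("cp1251"), "cp1251", with_bom=True)
```

Working on raw bytes keeps the original line endings intact, so only the character encoding changes.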
How to Use Auto-Detect to Find Obscure Encodings
If your file still shows gibberish after trying UTF-8, it might be in a particular localized encoding. Subtitle Edit can guess obscure encodings by analyzing the character patterns in the file.
Accessing Auto-Detect
Go to File > Auto detect encoding (or simply press Ctrl+A after selecting the menu option).
Subtitle Edit suggests encodings by probability, letting you pick the most likely match.
Verifying the Result
If the tool suggests “Windows-1251 (Cyrillic)” with 98% certainty, select it. The text should immediately snap into the correct language.
Once you have identified the source encoding, immediately convert the file to UTF-8 using the method described above for permanent safety.
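Full statistical detection is beyond a short example, but the first and most reliable stage of any detector can be sketched in Python: look for a BOM, then test whether the bytes are valid UTF-8. This is not Subtitle Edit's algorithm, and the name `sniff_encoding` is hypothetical.

```python
import codecs

# BOM signatures, checked in order. UTF-32 comes before UTF-16 because the
# UTF-32 little-endian BOM begins with the UTF-16 little-endian BOM.
_BOMS = [
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def sniff_encoding(data):
    """Return a Unicode codec name if the bytes identify one, else None."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    try:
        # Strict decoding rejects byte sequences that are not valid UTF-8.
        data.decode("utf-8")
    except UnicodeDecodeError:
        return None  # Probably a legacy code page; needs statistical guessing.
    return "utf-8"
```

A result of `None` corresponds to the case described above: the file is in some localized encoding, and a probability-based suggestion such as “Windows-1251 (Cyrillic)” is the next step.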
How to Fix Encoding Issues in Batch (Multiple Files)
If you have a whole season of a TV show with the same encoding problem, batch conversion fixes every file in one pass instead of one by one.
Using the Batch Convert Tool
Go to Tools > Batch convert… and drag all the files with the encoding problem into the input list. In the output settings area of the Batch Converter, look for the “Output encoding” dropdown.
Set the batch tool to use UTF-8 as the output encoding. Click Convert. Subtitle Edit will process every file, ensuring that all new files are saved in the universal UTF-8 format.
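The same idea can be scripted. The sketch below is a hypothetical Python helper, not Subtitle Edit's batch tool. It assumes every file in the folder shares one known source encoding and writes the converted copies to a separate folder so the originals stay untouched.

```python
import tempfile
from pathlib import Path


def batch_convert_to_utf8(folder, source_encoding, out_folder, pattern="*.srt"):
    """Re-encode every matching file in folder as UTF-8 inside out_folder."""
    out_dir = Path(out_folder)
    out_dir.mkdir(parents=True, exist_ok=True)
    converted = []
    for src in sorted(Path(folder).glob(pattern)):
        text = src.read_bytes().decode(source_encoding)
        dst = out_dir / src.name
        dst.write_bytes(text.encode("utf-8"))
        converted.append(dst)
    return converted


# Demonstration: two throwaway episodes saved in Windows-1251.
with tempfile.TemporaryDirectory() as tmp:
    season = Path(tmp) / "season1"
    season.mkdir()
    for n in (1, 2):
        (season / f"ep{n}.srt").write_bytes("Привет\n".encode("cp1251"))
    results = batch_convert_to_utf8(season, "cp1251", Path(tmp) / "utf8")
    decoded = [p.read_bytes().decode("utf-8") for p in results]
```

If one file in the batch uses a different encoding, the strict decode raises an error for that file rather than writing corrupted output.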
How to Handle Specific Encoding Problems from Mac/Linux
Files created on non-Windows operating systems can sometimes introduce subtle encoding variations that cause issues.
Linux and Mac systems often use a pure UTF-8 encoding that lacks the Byte Order Mark (BOM), a small header that some Windows software uses to identify the encoding. While this is usually not a problem, some older Windows programs may misinterpret these files.
Using UTF-8 with BOM
When saving your fixed file, always select “Unicode (UTF-8 with BOM)” as the encoding if you primarily use the file on Windows systems. The BOM serves as a clear signal to Windows, thereby enhancing compatibility.
UTF-8 is enough for most platforms, but UTF-8 with BOM is better for old Windows software.
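The difference between the two variants is only three bytes at the start of the file. A short Python illustration (the sample word is arbitrary):

```python
import codecs

text = "Merhaba"
plain = text.encode("utf-8")         # UTF-8 without BOM
with_bom = text.encode("utf-8-sig")  # UTF-8 with BOM

# The BOM is the byte sequence EF BB BF placed before the first character.
bom_is_prefix = with_bom == codecs.BOM_UTF8 + plain

# A reader that expects plain UTF-8 sees the BOM as an invisible U+FEFF
# character, while a BOM-aware reader strips it.
as_plain = with_bom.decode("utf-8")
as_sig = with_bom.decode("utf-8-sig")

print(bom_is_prefix)     # True
print(repr(as_plain))    # '\ufeffMerhaba'
print(repr(as_sig))      # 'Merhaba'
```

That stray U+FEFF is why a BOM occasionally causes trouble in tools that do not expect it, and why it is worth adding only when the target software benefits from it.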
How to Deal with Corrupted Characters After Saving
If you successfully converted the file to UTF-8 but still see question marks when viewing it outside of Subtitle Edit, the issue is not the subtitle file itself.
If the file displays correctly in Subtitle Edit, the problem lies with either your operating system’s font support or your media player’s text rendering.
Checking System Fonts
Ensure your operating system has the correct font installed to display the language (e.g., Arabic, Thai, Devanagari). If the font is missing, the OS will display a box or question mark.
Adjusting the Media Player
If using VLC or MPV, open the media player’s settings. Look for the Subtitle/OSD settings and ensure that Subtitle Encoding is set to UTF-8 manually. This overrides any regional default the player might be using.
Frequently Asked Questions About Encoding Issues
Why does my foreign language subtitle file show boxes or question marks?
This happens because the file was saved with a restricted encoding, such as ANSI, which cannot represent foreign characters. You must open the file in Subtitle Edit, convert the encoding to UTF-8, and then save it again.
What is the difference between ANSI and UTF-8?
ANSI is an outdated, regional encoding limited to a few hundred characters, causing errors with most non-Western languages. UTF-8 is the universal modern standard supporting every character in every language, making it essential for error-free subtitling.
If I change the encoding, will the time codes change as well?
No, changing the encoding only modifies how the characters themselves are stored (the letters). The time codes and the structure of the subtitle file remain intact, so you will not lose your sync work.
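A quick Python check illustrates why. Time codes consist of ASCII digits and punctuation, which have identical bytes in legacy code pages and in UTF-8, so only lines with non-ASCII letters are stored differently. The sample lines are made up for this example.

```python
timecode = "00:01:15,200 --> 00:01:18,400"
dialogue = "Günaydın"

# ASCII-only text produces the same bytes in Windows-1254 and UTF-8.
timecode_unchanged = timecode.encode("cp1254") == timecode.encode("utf-8")

# Non-ASCII letters are stored with different bytes in each encoding.
dialogue_bytes_differ = dialogue.encode("cp1254") != dialogue.encode("utf-8")

print(timecode_unchanged)     # True
print(dialogue_bytes_differ)  # True
```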
What does “Auto detect encoding” do?
It analyzes the byte patterns and character frequencies in the file and suggests the most likely original encoding (e.g., Japanese Shift-JIS or Greek ISO-8859-7). This is useful when you don’t know the file’s origin.
Why does my media player still show the wrong characters even after I saved it as UTF-8?
The media player is likely overriding your file’s encoding. Go into the player’s settings (e.g., VLC Subtitle settings) and manually set the Subtitle Encoding option to UTF-8.
Is “UTF-8” different from “UTF-8 with BOM”?
Yes, slightly. UTF-8 with BOM (Byte Order Mark) includes a small marker at the start of the file that clearly indicates it is UTF-8. It improves compatibility with older Windows programs but is often unnecessary for modern software.
How do I check the encoding of my file without opening Subtitle Edit?
You can open the file in a powerful text editor like Notepad++. Notepad++ will automatically detect and display the encoding type (e.g., ANSI, UTF-8) in the status bar at the bottom of the window.
My subtitle file is in XML/TTML format. Do I still need to worry about encoding?
Yes, all text-based formats, including XML, TTML, and SRT, rely on encoding. However, these modern formats typically default to UTF-8, so encoding issues are less common than with old SRT files.








