Codepage 437 Converter
What is this?
This tool converts text between Codepage 437 and Unicode.
During the BBS era much online art was made with text using the "high ASCII" character set technically known as the OEM Characterset or Codepage #437 (CP437). See textfiles.com for some classic ASCII & ANSI art or check out 16colo.rs for old and new textual art pack releases.
Today's dominant character set is Unicode and UTF-8 is its most popular storage format. UTF-8 is binary compatible with the first 128 characters of the old ASCII codepage (#0 to #127). Simple ASCII text files can be read as UTF-8 as long as they don't use the "extended" characters #128 to #255 (AKA "high ASCII"). Unicode includes codepoints for representing the "high ASCII" chars of CP437. However, the Unicode codepoints are not the same binary values as CP437.
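As a rough illustration of that difference, the following Python sketch decodes three "high ASCII" byte values and prints the Unicode codepoints and UTF-8 bytes they turn into. It relies on Python's built-in cp437 codec; the variable names are only illustrative and have nothing to do with how this tool is implemented.

```python
# Three CP437 "high ASCII" values: #176, #178 and #220.
cp437_bytes = bytes([176, 178, 220])

# Decode with Python's built-in CP437 codec.
text = cp437_bytes.decode("cp437")                    # '░▓▄'

# The Unicode codepoints differ from the original byte values ...
codepoints = [f"U+{ord(ch):04X}" for ch in text]      # ['U+2591', 'U+2593', 'U+2584']

# ... and each of these characters takes three bytes when stored as UTF-8.
utf8_bytes = text.encode("utf-8")

print(text, codepoints, utf8_bytes.hex(" "))

# Plain ASCII (#0 to #127) is byte-for-byte identical in ASCII and UTF-8.
assert "Hello".encode("ascii") == "Hello".encode("utf-8")
```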
wget http://artscene.textfiles.com/ansi/artwork/belinda.ans
cat belinda.ans
After Unicode conversion:
How Do I Use It?
Open an ASCII / ANSI art file that was saved using the "High ASCII" of Codepage 437 in a text editor or web browser. You could point your browser at one of the .ANS files on TextFiles.com. For example, open belinda.ans in your browser (or download it: right-click, save link as). In a browser or text editor the CP437 files will likely default to ISO-8859-1 encoding and display stuff like "°²Ü" instead of "░▓▄". You can manually set the encoding to ISO-8859-1 in most decent editors.
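The "°²Ü" versus "░▓▄" mix-up can be reproduced, and undone, outside the tool. The sketch below is one possible way to do it in Python, assuming the text really was decoded as ISO-8859-1; the function name fix_latin1_mojibake is hypothetical. Note that many browsers treat the ISO-8859-1 label as Windows-1252, so characters in the #128 to #159 range may not round-trip this way.

```python
def fix_latin1_mojibake(text: str) -> str:
    """Reinterpret text that was wrongly decoded as ISO-8859-1 as CP437."""
    # Encoding back to ISO-8859-1 recovers the original byte values;
    # decoding those bytes as CP437 yields the intended glyphs.
    # Raises UnicodeEncodeError if the text contains non-Latin-1 characters.
    return text.encode("latin-1").decode("cp437")


print(fix_latin1_mojibake("°²Ü"))  # ░▓▄
```

If you have the file itself rather than pasted text, reading it with open(path, encoding="cp437") avoids the detour through ISO-8859-1 altogether.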
Either copy and paste the ISO-8859-1 text into the Input field above, or use the → Load CP437 File ← button to load the saved file. Then press the button labeled ▼ Convert to UTF-8 ▼.
The default converter settings preserve the char codes for carriage return, linefeed and tab (Cr/Lf/Tb), as well as ANSI escape codes. This allows conversion of most "Low ASCII" (#0 - #31) control codes to their visual representations in Unicode. Glyphs for card suits (♥♦♣♠), musical notes (♪♫), arrows (↑↓↕←→↔), happy faces (☺☻), etc. will be visible in the Output field, but the Cr/Lf/Tb and ANSI escapes will remain as control codes so that ANSI art can be viewed on an ANSI-aware terminal.
Disable preservation of ANSI control codes and char #27 will be converted to Unicode even when followed by a bracket. This prevents ANSI-aware terminals from interpreting the escapes as control codes. I find this useful when writing documentation about ANSI escape codes.
All of the CP437 control codes have visual symbols. When the Cr/Lf/Tb option is disabled, the newline chars and tab are converted to their visual representations (♪◙○). This can be useful when converting character raster data, such as a memory dump from a text-based game.
Preserve Controls does not convert any of the control code characters (#0 to #27).
This may be desirable when ANSI uses vertical tab or other control characters. When this option
is enabled it overrides the Cr/Lf/Tb and Esc options, i.e., disabling those will not disable
preservation of the carriage return, linefeed, tab or escape control codes.
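The way these options interact can be sketched in Python. This is only an approximation based on the description above, not the tool's actual code: the function name, the parameter names and the glyph table are assumptions. The sketch treats the whole #0 to #31 range as control codes and only counts char #27 as an ANSI escape when it is directly followed by "[".

```python
# Visual glyphs for CP437 codes #0 to #31 (#0 has no visible glyph, so it stays NUL).
LOW_GLYPHS = "\x00☺☻♥♦♣♠•◘○◙♂♀♪♫☼►◄↕‼¶§▬↨↑↓→←∟↔▲▼"


def cp437_to_unicode(data: bytes, keep_crlftab=True, keep_ansi=True,
                     keep_controls=False) -> str:
    """Convert CP437 bytes to a Unicode string (hypothetical option names)."""
    out = []
    for i, byte in enumerate(data):
        if byte >= 32:
            # Printable and "high ASCII" values go through Python's cp437 codec.
            out.append(bytes([byte]).decode("cp437"))
            continue
        # Assumption: an ANSI escape is char #27 directly followed by "[".
        is_ansi_escape = byte == 27 and data[i + 1:i + 2] == b"["
        keep = (
            keep_controls                                  # overrides the other two
            or (keep_crlftab and byte in (9, 10, 13))      # tab, linefeed, carriage return
            or (keep_ansi and is_ansi_escape)
        )
        out.append(chr(byte) if keep else LOW_GLYPHS[byte])
    return "".join(out)


print(cp437_to_unicode(b"\x03\x04\x05\x06"))                 # ♥♦♣♠
print(cp437_to_unicode(b"\r\n\t", keep_crlftab=False))       # ♪◙○
print(cp437_to_unicode(b"\x1b[0m", keep_ansi=False))         # ←[0m
```

With the defaults, a sequence such as b"\x1b[0m" passes through unchanged, so the result can still be sent to an ANSI-aware terminal.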
Converting from Unicode to CP437
Use the → Load UTF-8 File ← button or paste Unicode text in the Input box, then hit the ▼ Convert to CP437 ▼ button. If you already had Unicode text in the Output box you can hit the ▼▲ Swap In & Out ▲▼ button instead to move it to the Input box.
When converting to CP437 the preservation settings are ignored. The codepoints for happy faces, card suits, arrows, etc. will be mapped back into the CP437 glyphs; control codes (#0 to #31) including carriage return, linefeed, tab, etc. also become the CP437 equivalents.
Conversion into CP437 can be considered a "lossy" operation, as glyphs once again share character numbers with control codes, whereas in Unicode they were separate codepoints. Any Unicode codepoint that is not directly translatable to CP437 will be stripped from the output upon conversion.