Skip to main content
Dave Mason - Mastodon Dave Mason - LinkedIn Dave Mason - Codeberg Dave Mason - Counter Social

Nonprintable ASCII Characters And Your Data

Dave Mason - SQL Server - ASCII

The ASCII character set is comprised of 128 characters. Many of them are nonprintable. When those characters appear within your data, it can get a little confusing. Especially when querying the data with SSMS, or outputting it to a file, or... Well, you get the idea.

The specific ASCII codes that are nonprintable are the ones from 0 to 31 and 127. We can "see" what those characters look like a few different ways with SSMS. Here is a query to run:


;WITH NonPrintableAsciiCodes AS
(
SELECT 0 AS IntCode
UNION ALL
SELECT 127 AS IntCode
UNION ALL
SELECT IntCode + 1
FROM NonPrintableAsciiCodes
WHERE IntCode < 31
)
SELECT IntCode,
CHAR(IntCode) CharValue,
'%' + CHAR(IntCode) + '%' EnclosedChar,
'%' + RTRIM(CHAR(IntCode)) + '%' EnclosedTrimmedChar
FROM NonPrintableAsciiCodes
ORDER BY IntCode

Below is the query result with output results to grid in SSMS (CTRL + D). The first column shows the integer code of the ASCII character. The second column attempts to show the actual ASCII character. The third column shows the ASCII character enclosed in a percent % character. The last column attempts to "trim" the ASCII character, also enclosed in a percent % character: this demonstrates that what may appear to be a space character really isn't one.

Here is the same query result, this time with SSMS results to text (CTRL + T). Notice the integer code 9 is significantly out of alignment with the others, while integer codes 10 and 13 spill onto multiple lines:

Integer codes 9, 10, and 13 correspond to Tab, Line Feed, and Carriage Return characters respectively. If I encounter them within SQL Server data, I generally leave those intact. Although it's fair to question why formatting characters are stored in a relational database. If any of the other nonprintable characters are discovered, I find those are usually accidental outliers. Those can be replaced, usually with a single space character or an empty string '', depending on the data.

Be wary of nonprintable characters in your data. They can be really deceptive.

Comments

Post comment