Nonprintable ASCII Characters And Your Data
- Posted in:
- Development
- SSMS
- T-SQL
The ASCII character set is comprised of 128 characters. Many of them are nonprintable. When those characters appear within your data, it can get a little confusing. Especially when querying the data with SSMS, or outputting it to a file, or... Well, you get the idea.
The specific ASCII codes that are nonprintable are the ones from 0 to 31 and 127. We can "see" what those characters look like a few different ways with SSMS. Here is a query to run:
;WITH NonPrintableAsciiCodes AS
(
SELECT 0 AS IntCode
UNION ALL
SELECT 127 AS IntCode
UNION ALL
SELECT IntCode + 1
FROM NonPrintableAsciiCodes
WHERE IntCode < 31
)
SELECT IntCode,
CHAR(IntCode) CharValue,
'%' + CHAR(IntCode) + '%' EnclosedChar,
'%' + RTRIM(CHAR(IntCode)) + '%' EnclosedTrimmedChar
FROM NonPrintableAsciiCodes
ORDER BY IntCode
Below is the query result with output results to grid in SSMS (CTRL + D). The first column shows the integer code of the ASCII character. The second column attempts to show the actual ASCII character. The third column shows the ASCII character enclosed in a percent % character. The last column attempts to "trim" the ASCII character, also enclosed in a percent % character: this demonstrates that what may appear to be a space character really isn't one.
Here is the same query result, this time with SSMS results to text (CTRL + T). Notice the integer code 9 is significantly out of alignment with the others, while integer codes 10 and 13 spill onto multiple lines:
Integer codes 9, 10, and 13 correspond to Tab, Line Feed, and Carriage Return characters respectively. If I encounter them within SQL Server data, I generally leave those intact. Although it's fair to question why formatting characters are stored in a relational database. If any of the other nonprintable characters are discovered, I find those are usually accidental outliers. Those can be replaced, usually with a single space character or an empty string '', depending on the data.
Be wary of nonprintable characters in your data. They can be really deceptive.
Comments