DelphiDabbler Code Snippets Database

"Encoding" Category Contents

The following snippets belong to the Encoding category.

BytesToAnsiString

Converts the given array of bytes to a RawByteString and returns it. The bytes are copied unchanged and the returned string is tagged with the code page specified by CodePage; no transcoding takes place.

function BytesToAnsiString(const Bytes: SysUtils.TBytes; const CodePage: Word):
  RawByteString;
begin
  SetLength(Result, Length(Bytes));
  if Length(Bytes) > 0 then
  begin
    Move(Bytes[0], Result[1], Length(Bytes));
    SetCodePage(Result, CodePage, False);
  end;
end;
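
As a quick illustration of the routine above, the following sketch copies three bytes into a string tagged as Windows-1252. The procedure name DemoBytesToAnsiString is hypothetical, and the example assumes SysUtils is in the uses clause and that BytesToAnsiString is in scope.

```pascal
// Assumes: uses SysUtils; and BytesToAnsiString (above) in scope.
// DemoBytesToAnsiString is a hypothetical name used only for illustration.
procedure DemoBytesToAnsiString;
var
  Bytes: SysUtils.TBytes;
  S: RawByteString;
begin
  // 'H', 'i', then byte $E9, which is e-acute in Windows-1252
  Bytes := SysUtils.TBytes.Create($48, $69, $E9);
  S := BytesToAnsiString(Bytes, 1252);
  // One string element per input byte
  Assert(Length(S) = 3);
  // The string carries the requested code page
  Assert(StringCodePage(S) = 1252);
  // An empty array yields an empty string
  Assert(BytesToAnsiString(nil, 1252) = '');
end;
```

Because the bytes are only tagged, not converted, the caller is responsible for passing the code page the bytes were actually encoded in.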

CodePageSupportsString

Checks if the given code page supports all the characters in the given string. Returns True if all characters convert to the code page correctly or False if any character is not valid in the code page.

function CodePageSupportsString(const S: UnicodeString;
  const CodePage: Word): Boolean;
var
  Encoding: SysUtils.TEncoding;  // Encoding for required code page
begin
  Encoding := SysUtils.TMBCSEncoding.Create(CodePage);
  try
    Result := EncodingSupportsString(S, Encoding);
  finally
    Encoding.Free;
  end;
end;
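
A short usage sketch for the routine above. It assumes a Windows system with code pages 1251 and 1252 installed, and that CodePageSupportsString and EncodingSupportsString are in scope. The procedure name and the sample strings are hypothetical.

```pascal
// Assumes: uses SysUtils; and CodePageSupportsString in scope.
// DemoCodePageSupportsString is a hypothetical name used only for illustration.
procedure DemoCodePageSupportsString;
const
  // Written with character codes so the example does not depend on the
  // encoding of the source file itself
  Cafe = 'caf'#$00E9;              // "cafe" with e-acute
  Zhuk = #$0416#$0443#$043A;       // three Cyrillic letters
begin
  // e-acute exists in Windows-1252 (Western European)
  Assert(CodePageSupportsString(Cafe, 1252));
  // Cyrillic letters do not exist in Windows-1252 ...
  Assert(not CodePageSupportsString(Zhuk, 1252));
  // ... but they do exist in Windows-1251 (Cyrillic)
  Assert(CodePageSupportsString(Zhuk, 1251));
end;
```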

EncodingSupportsString

Checks if the given encoding supports all the characters in the given string. Returns True if all characters convert to the encoding correctly or False if any character is not valid in the encoding.

function EncodingSupportsString(const S: UnicodeString;
  const Encoding: SysUtils.TEncoding): Boolean;
var
  ConvertedStr: UnicodeString;   // string converted using Encoding
begin
  // Convert S to bytes and back to unicode string using Encoding
  ConvertedStr := Encoding.GetString(Encoding.GetBytes(S));
  // If text is valid for given encoding, text and converted text must be same
  Result := S = ConvertedStr;
end;
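
The round-trip test can be seen at work with the RTL's standard encodings. This is a minimal sketch, assuming EncodingSupportsString is in scope; the procedure name is hypothetical.

```pascal
// Assumes: uses SysUtils; and EncodingSupportsString in scope.
// DemoEncodingSupportsString is a hypothetical name used only for illustration.
procedure DemoEncodingSupportsString;
const
  Cafe = 'caf'#$00E9;   // "cafe" with e-acute, written with a character code
begin
  // Plain 7-bit text survives a round trip through ASCII
  Assert(EncodingSupportsString('plain text', SysUtils.TEncoding.ASCII));
  // e-acute cannot be represented in ASCII, so the round trip changes the text
  Assert(not EncodingSupportsString(Cafe, SysUtils.TEncoding.ASCII));
  // UTF-8 can represent the character, so the round trip is lossless
  Assert(EncodingSupportsString(Cafe, SysUtils.TEncoding.UTF8));
end;
```

Note that the standard encodings used here are shared instances owned by the RTL and must not be freed by the caller.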

IsASCIIChar

Checks if the given character is a valid ASCII character.

function IsASCIIChar(const Ch: Char): Boolean;
begin
  Result := Ord(Ch) <= $7F;
end;

IsASCIIDigit

Checks if the given character is a digit in the ASCII character set.

function IsASCIIDigit(const Ch: Char): Boolean;
begin
  Result := Ord(Ch) in [Ord('0')..Ord('9')];
end;
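
As an illustration of how the two character tests above might be used, here is a sketch of a helper that keeps only the ASCII digits of a string. ExtractASCIIDigits is a hypothetical routine, not part of the database; it assumes IsASCIIDigit is in scope.

```pascal
// Hypothetical helper, for illustration only.
// Returns a copy of S containing only the characters '0'..'9'.
function ExtractASCIIDigits(const S: string): string;
var
  Ch: Char;
begin
  Result := '';
  for Ch in S do
    if IsASCIIDigit(Ch) then
      Result := Result + Ch;
end;
```

For example, ExtractASCIIDigits('Tel: 555-0199') returns '5550199'. IsASCIIChar can be used the same way: it accepts 'A' but rejects a character such as #$00E9.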

IsASCIIFile

Checks if the file named by FileName is a valid ASCII text file. BytesToCheck determines the number of bytes of the file that are to be checked. Specify 0 (the default) to check the whole file. The file is read in chunks of BufSize bytes. If this parameter is omitted, the buffer size defaults to 8KB.

function IsASCIIFile(const FileName: string; BytesToCheck: Int64 = 0;
  BufSize: Integer = 8*1024): Boolean;
var
  Stm: Classes.TStream;
begin
  Stm := Classes.TFileStream.Create(
    FileName, SysUtils.fmOpenRead or SysUtils.fmShareDenyNone
  );
  try
    Result := IsASCIIStream(Stm, BytesToCheck, BufSize);
  finally
    Stm.Free;
  end;
end;
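
The effect of BytesToCheck can be shown with a small scratch file. This is a sketch only: the file name is hypothetical, the file is created in the current directory, and SysUtils, Classes and the IsASCIIFile and IsASCIIStream routines are assumed to be in scope.

```pascal
// Assumes: uses SysUtils, Classes; and IsASCIIFile in scope.
// DemoIsASCIIFile and the file name are hypothetical.
procedure DemoIsASCIIFile;
const
  FileName = 'isascii-demo.tmp';
var
  Stm: Classes.TFileStream;
  Data: SysUtils.TBytes;
begin
  // Write 'A', 'B' and one non-ASCII byte ($E9) to the scratch file
  Data := SysUtils.TBytes.Create($41, $42, $E9);
  Stm := Classes.TFileStream.Create(FileName, fmCreate);
  try
    Stm.WriteBuffer(Data[0], Length(Data));
  finally
    Stm.Free;
  end;
  try
    // Only the first two bytes are examined: both are ASCII
    Assert(IsASCIIFile(FileName, 2));
    // The whole file is examined: the third byte is not ASCII
    Assert(not IsASCIIFile(FileName));
  finally
    SysUtils.DeleteFile(FileName);
  end;
end;
```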

IsASCIIStream

Checks whether stream Stm contains valid ASCII text. Starting from the current position in the stream, the next Count bytes are examined. If Count is omitted, is <= 0, or exceeds the number of bytes remaining, the remainder of the stream is examined. The stream is read in chunks of BufSize bytes. If this parameter is omitted, the buffer size defaults to 8KB; values below 1024 are raised to 1024. The stream's original position is restored before the function returns.

function IsASCIIStream(const Stm: Classes.TStream; Count: Int64 = 0;
  BufSize: Integer = 8*1024): Boolean;
var
  StmPos: Int64;        // original stream position
  Buf: array of Byte;   // stream read buffer
  BytesRead: Integer;   // number of bytes read from stream in each chunk
  I: Integer;           // loops through each byte in read buffer
begin
  Result := False;
  StmPos := Stm.Position;
  try
    if BufSize < 1024 then
      BufSize := 1024;
    SetLength(Buf, BufSize);
    // Clamp Count to the bytes remaining after the current position
    if (Count <= 0) or (Count > Stm.Size - StmPos) then
      Count := Stm.Size - StmPos;
    while Count > 0 do
    begin
      BytesRead := Stm.Read(Pointer(Buf)^, Math.Min(Count, Length(Buf)));
      if BytesRead = 0 then
        Exit;
      Dec(Count, BytesRead);
      for I := 0 to Pred(BytesRead) do
        if Buf[I] > $7F then
          Exit;
    end;
    Result := True;
  finally
    Stm.Position := StmPos;
  end;
end;
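
The following sketch exercises the routine above on an in-memory stream, showing both the Count parameter and the restored stream position. It assumes SysUtils, Classes and Math are in the uses clause and that IsASCIIStream is in scope; the procedure name is hypothetical.

```pascal
// Assumes: uses SysUtils, Classes, Math; and IsASCIIStream in scope.
// DemoIsASCIIStream is a hypothetical name used only for illustration.
procedure DemoIsASCIIStream;
var
  Stm: Classes.TBytesStream;
begin
  // 'A', 'B', then byte $E9 (outside the 7-bit ASCII range), then 'C'
  Stm := Classes.TBytesStream.Create(
    SysUtils.TBytes.Create($41, $42, $E9, $43)
  );
  try
    // Only the first two bytes are examined
    Assert(IsASCIIStream(Stm, 2));
    // The whole stream is examined, including $E9
    Assert(not IsASCIIStream(Stm));
    // The stream position is back where it started
    Assert(Stm.Position = 0);
  finally
    Stm.Free;
  end;
end;
```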

IsASCIIText

Checks if the given string contains only valid ASCII characters. Returns True if so or False otherwise.

function IsASCIIText(const Text: UnicodeString): Boolean;
begin
  Result := EncodingSupportsString(Text, SysUtils.TEncoding.ASCII);
end;

IsUnicodeFile

Checks if a file contains UTF-16 little-endian encoded text by testing for the byte order mark at the start of the file.

function IsUnicodeFile(const FileName: string): Boolean;
var
  FS: Classes.TFileStream;  // stream onto file being tested
begin
  FS := Classes.TFileStream.Create(
    FileName, SysUtils.fmOpenRead or SysUtils.fmShareDenyNone
  );
  try
    Result := IsUnicodeStream(FS);
  finally
    FS.Free;
  end;
end;

IsUnicodeStream

Checks if a stream contains UTF-16 little-endian encoded text by testing for the byte order mark at the current position. The stream's position is left unchanged.

function IsUnicodeStream(const Stm: Classes.TStream): Boolean;
var
  StmPos: Int64;        // current position in stream
  BOM: Word;            // Unicode byte order mark
begin
  // Record current location in stream
  StmPos := Stm.Position;
  // Check if stream large enough to contain BOM (empty text file contains only
  // the BOM)
  if StmPos <= Stm.Size - SizeOf(BOM) then
  begin
    // Read first word and check if it is the unicode marker
    Stm.ReadBuffer(BOM, SizeOf(BOM));
    Result := (BOM = $FEFF);
    // Restore stream position
    Stm.Position := StmPos;
  end
  else
    // Stream too small: can't be unicode
    Result := False;
end;
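
The routine above reads the mark as a single Word, so on a little-endian CPU the byte sequence $FF $FE in the stream compares equal to $FEFF. The sketch below demonstrates this, assuming a little-endian target (such as x86 or x64) and that IsUnicodeStream is in scope. The procedure name is hypothetical.

```pascal
// Assumes: uses SysUtils, Classes; a little-endian CPU; IsUnicodeStream in scope.
// DemoIsUnicodeStream is a hypothetical name used only for illustration.
procedure DemoIsUnicodeStream;
var
  Stm: Classes.TBytesStream;
begin
  // Bytes $FF $FE (the UTF-16 LE mark) followed by 'A' as UTF-16 LE ($41 $00)
  Stm := Classes.TBytesStream.Create(
    SysUtils.TBytes.Create($FF, $FE, $41, $00)
  );
  try
    // The mark is found at the current position (the start of the stream)
    Assert(IsUnicodeStream(Stm));
    // The test does not move the stream position
    Assert(Stm.Position = 0);
    // Past the mark, the next two bytes are ordinary text, not a mark
    Stm.Position := 2;
    Assert(not IsUnicodeStream(Stm));
  finally
    Stm.Free;
  end;
end;
```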

IsUTF16BEFile

Checks if the given file is a UTF-16 big-endian encoded text file by examining the byte order mark at the start of the file.

function IsUTF16BEFile(const FileName: string): Boolean;
var
  Stm: Classes.TStream;
begin
  Stm := Classes.TFileStream.Create(
    FileName, SysUtils.fmOpenRead or SysUtils.fmShareDenyNone
  );
  try
    Result := IsUTF16BEStream(Stm);
  finally
    Stm.Free;
  end;
end;

IsUTF16BEStream

Checks if the given stream contains UTF-16 big-endian encoded text. A check is made for the correct byte order mark at the current stream position. The stream's original position is retained.

function IsUTF16BEStream(const Stm: Classes.TStream): Boolean;
begin
  Result := StreamHasWatermark(Stm, [$FE, $FF]);
end;
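
IsUTF16BEStream delegates to StreamHasWatermark, which is not listed in this category. The following is a minimal sketch of how such a routine might behave, inferred only from the way it is called here and from the description above (compare bytes at the current position, then restore the position). It is an assumption for illustration, not the database's actual implementation.

```pascal
// Hypothetical sketch of StreamHasWatermark, inferred from its call sites.
// Assumes: uses Classes.
// Returns True if the bytes at the stream's current position equal Watermark.
// The stream position is restored before returning.
function StreamHasWatermark(const Stm: Classes.TStream;
  const Watermark: array of Byte): Boolean;
var
  StmPos: Int64;        // original stream position
  Buf: array of Byte;   // bytes read from the stream
  I: Integer;           // loops through the watermark bytes
begin
  Result := False;
  if Length(Watermark) = 0 then
    Exit;
  StmPos := Stm.Position;
  try
    // Not enough bytes left to hold the watermark
    if Stm.Size - StmPos < Length(Watermark) then
      Exit;
    SetLength(Buf, Length(Watermark));
    Stm.ReadBuffer(Buf[0], Length(Buf));
    for I := 0 to High(Watermark) do
      if Buf[I] <> Watermark[I] then
        Exit;
    Result := True;
  finally
    Stm.Position := StmPos;
  end;
end;
```

With this sketch in scope, a stream whose next two bytes are $FE $FF passes the test and is left at its original position. A stream with fewer bytes remaining than the watermark is simply reported as not matching.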

IsUTF16File

Checks if the given file is a UTF-16 encoded text file by examining the byte order mark at the start of the file. Both little- and big-endian encodings are accepted.

function IsUTF16File(const FileName: string): Boolean;
var
  Stm: Classes.TStream;
begin
  Stm := Classes.TFileStream.Create(
    FileName, SysUtils.fmOpenRead or SysUtils.fmShareDenyNone
  );
  try
    Result := IsUTF16Stream(Stm);
  finally
    Stm.Free;
  end;
end;

IsUTF16LEFile

Checks if the given file is a UTF-16 little-endian encoded text file by examining the byte order mark at the start of the file.

function IsUTF16LEFile(const FileName: string): Boolean;
var
  Stm: Classes.TStream;
begin
  Stm := Classes.TFileStream.Create(
    FileName, SysUtils.fmOpenRead or SysUtils.fmShareDenyNone
  );
  try
    Result := IsUTF16LEStream(Stm);
  finally
    Stm.Free;
  end;
end;

IsUTF16LEStream

Checks if the given stream contains UTF-16 little-endian encoded text. A check is made for the correct byte order mark at the current stream position. The stream's original position is retained.

function IsUTF16LEStream(const Stm: Classes.TStream): Boolean;
begin
  Result := StreamHasWatermark(Stm, [$FF, $FE]);
end;

IsUTF16Stream

Checks if the given stream contains UTF-16 encoded text in either big- or little-endian format. A check is made for the correct byte order marks at the current stream position. The stream's original position is retained.

function IsUTF16Stream(const Stm: Classes.TStream): Boolean;
begin
  Result := StreamHasWatermark(Stm, [$FF, $FE])  // UTF-16 LE
    or StreamHasWatermark(Stm, [$FE, $FF])       // UTF-16 BE
end;

IsUTF7File

Checks if the given file is a UTF-7 encoded text file by examining the byte order mark at the start of the file.

function IsUTF7File(const FileName: string): Boolean;
var
  Stm: Classes.TStream;
begin
  Stm := Classes.TFileStream.Create(
    FileName, SysUtils.fmOpenRead or SysUtils.fmShareDenyNone
  );
  try
    Result := IsUTF7Stream(Stm);
  finally
    Stm.Free;
  end;
end;

IsUTF7Stream

Checks if the given stream contains UTF-7 encoded text. A check is made for the correct byte order marks at the current stream position. The stream's original position is retained.

function IsUTF7Stream(const Stm: Classes.TStream): Boolean;
begin
  Result := StreamHasWatermark(Stm, [$2B, $2F, $76, $38])
    or StreamHasWatermark(Stm, [$2B, $2F, $76, $39])
    or StreamHasWatermark(Stm, [$2B, $2F, $76, $2B])
    or StreamHasWatermark(Stm, [$2B, $2F, $76, $2F]);
end;

IsUTF8File

Checks if the given file is a UTF-8 encoded text file by examining the byte order mark at the start of the file.

function IsUTF8File(const FileName: string): Boolean;
var
  Stm: Classes.TStream;
begin
  Stm := Classes.TFileStream.Create(
    FileName, SysUtils.fmOpenRead or SysUtils.fmShareDenyNone
  );
  try
    Result := IsUTF8Stream(Stm);
  finally
    Stm.Free;
  end;
end;

IsUTF8Stream

Checks if the given stream contains UTF-8 encoded text. A check is made for the correct byte order mark at the current stream position. The stream's original position is retained.

function IsUTF8Stream(const Stm: Classes.TStream): Boolean;
begin
  Result := StreamHasWatermark(Stm, [$EF, $BB, $BF]);
end;
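
The stream tests in this category could be combined to choose a TEncoding for reading a stream. The sketch below is one possible arrangement, not a routine from the database: GuessStreamEncoding and DemoGuessStreamEncoding are hypothetical names, and the code assumes the IsUTF8Stream, IsUTF16LEStream, IsUTF16BEStream and IsUTF7Stream routines (and the StreamHasWatermark routine they call) are in scope.

```pascal
// Hypothetical helper, for illustration only.
// Assumes: uses SysUtils, Classes; and the IsUTF*Stream routines in scope.
// Returns one of the RTL's standard encodings, which must not be freed.
function GuessStreamEncoding(const Stm: Classes.TStream): SysUtils.TEncoding;
begin
  if IsUTF8Stream(Stm) then
    Result := SysUtils.TEncoding.UTF8
  else if IsUTF16LEStream(Stm) then
    Result := SysUtils.TEncoding.Unicode
  else if IsUTF16BEStream(Stm) then
    Result := SysUtils.TEncoding.BigEndianUnicode
  else if IsUTF7Stream(Stm) then
    Result := SysUtils.TEncoding.UTF7
  else
    // No recognised byte order mark: fall back to the system default
    Result := SysUtils.TEncoding.Default;
end;

procedure DemoGuessStreamEncoding;
var
  Stm: Classes.TBytesStream;
begin
  // UTF-8 byte order mark followed by 'A'
  Stm := Classes.TBytesStream.Create(
    SysUtils.TBytes.Create($EF, $BB, $BF, $41)
  );
  try
    Assert(GuessStreamEncoding(Stm) = SysUtils.TEncoding.UTF8);
  finally
    Stm.Free;
  end;
end;
```

Two caveats apply to any scheme of this kind. A missing byte order mark does not prove the text is in the default encoding, since UTF-8 files are often written without one. Also, the UTF-32 little-endian mark ($FF $FE $00 $00) begins with the UTF-16 little-endian mark, so this sketch would report such a stream as UTF-16 LE.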

SafeFreeEncoding

Frees the given TEncoding object unless it is one of the standard encodings, in which case it is left as-is. Returns True if the encoding was freed and False if not.

function SafeFreeEncoding(const Enc: SysUtils.TEncoding): Boolean;
begin
  if SysUtils.TEncoding.IsStandardEncoding(Enc) then
    Exit(False);
  Enc.Free;
  Result := True;
end;
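
A usage sketch for the routine above. It relies on the understanding that TEncoding.GetEncoding returns a new instance owned by the caller, while properties such as TEncoding.UTF8 return shared instances owned by the RTL; treat that as an assumption to check against your RTL version. The procedure name is hypothetical.

```pascal
// Assumes: uses SysUtils; and SafeFreeEncoding in scope.
// DemoSafeFreeEncoding is a hypothetical name used only for illustration.
procedure DemoSafeFreeEncoding;
var
  Enc: SysUtils.TEncoding;
  Freed: Boolean;
begin
  // An encoding created on request belongs to the caller and is freed
  Enc := SysUtils.TEncoding.GetEncoding(1252);
  Freed := SafeFreeEncoding(Enc);
  Assert(Freed);
  // A standard encoding is shared and is left alone
  Freed := SafeFreeEncoding(SysUtils.TEncoding.UTF8);
  Assert(not Freed);
end;
```

The result is stored in a variable before being asserted so that SafeFreeEncoding is still called when assertions are compiled out.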
