
How is input-output stream handling organized in Perl? Explain the features of layers, file descriptor assignments, and provide examples of correct handling of binary and text files.


Answer.

In Perl, files are accessed through filehandles created with open. Besides the standard filehandles STDIN, STDOUT, and STDERR (normally file descriptors 0, 1, and 2), you can create and manage your own.
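As a minimal sketch of how handles relate to descriptors, the snippet below reports the descriptor numbers of the standard handles with fileno and duplicates STDOUT into a lexical handle. The variable names (%std_fd, $out_copy, $copy_fd) are illustrative only.

```perl
use strict;
use warnings;

# The standard handles are normally bound to descriptors 0, 1 and 2.
my %std_fd = (
    STDIN  => fileno(STDIN),
    STDOUT => fileno(STDOUT),
    STDERR => fileno(STDERR),
);

# The '>&' mode duplicates an existing handle (a dup at the OS level),
# so the copy receives a descriptor number of its own.
open(my $out_copy, '>&', \*STDOUT) or die "Cannot dup STDOUT: $!";
my $copy_fd = fileno($out_copy);

printf {$out_copy} "STDOUT is fd %d, its duplicate is fd %d\n",
    $std_fd{STDOUT}, $copy_fd;

close($out_copy) or die "Cannot close duplicate: $!";
```

Closing the duplicate leaves the original STDOUT open, which is why this pattern is often used to save a handle before redirecting it.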

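Perl uses I/O layers (:raw, :crlf, :utf8, :encoding(NAME), etc.) to handle different file types and encodings. With no layer specified, a handle gets the platform default: on Unix-like systems bytes pass through unchanged, while on Windows the :crlf layer translates CRLF to LF on input and LF to CRLF on output. In neither case is the data decoded: each byte is read as one character until an encoding layer is added.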
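To see which layers are actually on a handle, PerlIO::get_layers can be called on it. The sketch below opens a scratch file (created with File::Temp purely for the demonstration) with an encoding layer, then switches the same handle to :raw with binmode.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical scratch file; File::Temp removes it at exit.
my ($tmp, $path) = tempfile(UNLINK => 1);
close($tmp) or die "Cannot close $path: $!";

# Open with an explicit encoding layer and list the resulting stack.
open(my $fh, '<:encoding(UTF-8)', $path) or die "Cannot open $path: $!";
my @text_layers = PerlIO::get_layers($fh);

# binmode can change layers on an already open handle;
# ':raw' removes the layers that translate or decode data.
binmode($fh, ':raw') or die "binmode failed: $!";
my @raw_layers = PerlIO::get_layers($fh);
close($fh) or die "Cannot close $path: $!";

print "text: @text_layers\n";
print "raw:  @raw_layers\n";
```

The exact layer names printed depend on the platform and Perl build, so treat the output as diagnostic rather than something to parse.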

Example of opening a text file with explicit encoding:

open my $fh, '<:encoding(UTF-8)', 'file.txt' or die $!;
while (my $line = <$fh>) {
    print $line;
}
close $fh;

Example of opening a binary file:

open my $fh, '<:raw', 'image.bin' or die $!;
defined read($fh, my $data, -s $fh) or die $!;   # -s on the open handle; read returns undef on error
close $fh;

Choosing the correct layer (:raw for binary, :encoding(NAME) for text) ensures proper reading and writing.
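The same choice applies when writing. The following sketch, which uses a temporary directory and made-up file names, writes one text file through an encoding layer and one binary file through :raw, then compares sizes on disk.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);    # hypothetical scratch directory

# Text: characters go in, the layer encodes them to UTF-8 bytes on disk.
my $text = "caf\x{e9}\n";           # four letters plus a newline
open(my $out, '>:encoding(UTF-8)', "$dir/note.txt") or die "open: $!";
print {$out} $text;
close($out) or die "close: $!";

# Binary: bytes go in, the same bytes land on disk, nothing is translated.
my $blob = pack('C*', 0x89, 0x0D, 0x0A, 0x1A, 0x00);
open(my $bin, '>:raw', "$dir/blob.bin") or die "open: $!";
print {$bin} $blob;
close($bin) or die "close: $!";

my $text_size = -s "$dir/note.txt";  # bytes on disk, not characters
my $blob_size = -s "$dir/blob.bin";

printf "text: %d characters, %d bytes on disk\n", length($text), $text_size;
printf "blob: %d bytes in memory, %d bytes on disk\n", length($blob), $blob_size;
```

The text file is larger in bytes than the string is in characters because the accented letter takes two bytes in UTF-8, while the binary file matches the in-memory length exactly.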


Trick Question.

If you open a file with open FH, '<', $file and read binary data, will you always get correct results?

Answer: No! Without :raw, Perl on some platforms automatically translates newline sequences (e.g., CRLF → LF on Windows), which alters binary data. Always use the :raw layer when reading binary files, and check the result of open:

open my $fh, '<:raw', 'file.bin' or die $!;
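When a handle was opened without a layer, for example by code you do not control, calling binmode on it before the first read has the same effect as :raw. A minimal sketch, with slurp_bytes as a hypothetical helper name:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: read a whole file as bytes, whatever the platform default is.
sub slurp_bytes {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Cannot open $path: $!";
    binmode($fh) or die "binmode failed on $path: $!";   # same effect as ':raw'
    local $/;                       # undefined separator: read everything at once
    my $data = <$fh>;
    close($fh) or die "Cannot close $path: $!";
    return defined $data ? $data : '';
}

# Demo payload containing bytes that a CRLF-translating layer would alter.
my ($tmp, $path) = tempfile(UNLINK => 1);
binmode($tmp);
my $payload = "\x0D\x0A\x00\xFF\x0D\x0A";
print {$tmp} $payload;
close($tmp) or die "Cannot close $path: $!";

my $bytes = slurp_bytes($path);
print "round trip ", ($bytes eq $payload ? "ok" : "FAILED"), "\n";
```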

Examples of real errors caused by overlooking these nuances.


Story

In one corporate project, developers read text logs line by line without specifying an encoding. UTF-8 logs containing Cyrillic sometimes came out corrupted: with no decoding layer, Perl treated each byte of a multi-byte character as a separate character, so character-level processing (length, substr, regular expressions) or re-encoding on output could mangle the text. The issue was fixed only after the :encoding(UTF-8) layer was added explicitly to the open call.
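The effect is easy to reproduce. This sketch writes the UTF-8 bytes of a three-letter Cyrillic word to a scratch file and reads it back with and without a decoding layer; first_line_length is a hypothetical helper introduced for the comparison.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# UTF-8 bytes for a three-letter Cyrillic word (two bytes per letter) plus a newline.
my ($tmp, $path) = tempfile(UNLINK => 1);
binmode($tmp);
print {$tmp} "\xD0\xBB\xD0\xBE\xD0\xB3\n";
close($tmp) or die "Cannot close $path: $!";

# Hypothetical helper: length of the first line when read through the given mode.
sub first_line_length {
    my ($mode, $file) = @_;
    open(my $fh, $mode, $file) or die "Cannot open $file: $!";
    my $line = <$fh>;
    close($fh) or die "Cannot close $file: $!";
    chomp $line;
    return length $line;
}

my $as_bytes = first_line_length('<', $path);                   # 6: one "character" per byte
my $as_chars = first_line_length('<:encoding(UTF-8)', $path);   # 3: decoded characters

print "without layer: $as_bytes, with :encoding(UTF-8): $as_chars\n";
```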


Story

On Windows, a program copied binary files by reading them with open FH, '<', 'binfile.dat' and writing them without specifying a mode either. Images came out corrupted because the default text-mode layer translated line endings (CRLF to LF on input, LF to CRLF on output), changing bytes that were never line endings in the first place. Opening both handles with the :raw layer fixed the issue.
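A byte-for-byte copy along these lines might look as follows. copy_binary and the 64 KiB chunk size are illustrative choices, not taken from the project described.

```perl
use strict;
use warnings;

# Hypothetical helper: copy a file with ':raw' on both handles,
# returning the number of bytes copied.
sub copy_binary {
    my ($src, $dst) = @_;
    open(my $in,  '<:raw', $src) or die "Cannot read $src: $!";
    open(my $out, '>:raw', $dst) or die "Cannot write $dst: $!";

    my $total = 0;
    while (1) {
        my $got = read($in, my $chunk, 64 * 1024);
        die "Read error on $src: $!" unless defined $got;
        last if $got == 0;                       # end of file
        print {$out} $chunk or die "Write error on $dst: $!";
        $total += $got;
    }
    close($in)  or die "Cannot close $src: $!";
    close($out) or die "Cannot close $dst: $!";  # flush errors surface here
    return $total;
}
```

Checking the result of close on the output handle matters because buffered data is only flushed at that point. For plain copies, the core File::Copy module is usually a simpler option.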


Story

For an external API, STDOUT output had to be in UTF-8, but the programmers called print without setting an output layer. The application sent text in the local character encoding, and Cyrillic appeared as "garbage" on the client side. The problem was resolved by explicitly calling binmode STDOUT, ':encoding(UTF-8)'.
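A sketch of that fix, assuming the layer is set once at program start before anything is printed. The in-memory handle ($mem writing into $wire) is added here only to show which bytes the same layer produces.

```perl
use strict;
use warnings;

# Set the output encoding once, before the first print.
binmode(STDOUT, ':encoding(UTF-8)') or die "binmode STDOUT: $!";
binmode(STDERR, ':encoding(UTF-8)') or die "binmode STDERR: $!";

# Six Cyrillic characters, written with escapes so the source stays ASCII.
my $greeting = "\x{41F}\x{440}\x{438}\x{432}\x{435}\x{442}";
print $greeting, "\n";

# The same layer on an in-memory handle shows what a client would receive.
my $wire = '';
open(my $mem, '>:encoding(UTF-8)', \$wire) or die "in-memory open: $!";
print {$mem} $greeting;
close($mem) or die "close: $!";

my $wire_bytes = length $wire;   # 12: two bytes per character
```

Without the binmode call, printing this string would trigger a "Wide character" warning under use warnings, which is a useful early signal that an output layer is missing.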