Learn Programming: Files and Serialization (Marshalling)

Example of creating a text file in four programming languages: Python, Lua, GDScript and JavaScript.

Image credits: Image created by the author using the program Spectacle.

Requirements

In the introduction to development environments, I mentioned Python, Lua and JavaScript as good choices of programming languages for beginners. Later, I suggested GDScript as an option for people who want to program digital games or simulations. For the introductory programming activities, you will need, at least, a development environment configured for one of these languages.

If you wish to try programming without configuring an environment, you can use one of the online editors that I have created:

However, they do not provide all features offered by interpreters for the languages. Thus, sooner or later, you will need to set up a development environment. If you need to configure one, you can refer to the following resources.

Thus, if you have an Integrated Development Environment (IDE), or a combination of text editor and an interpreter, you are ready to start. The following example assumes that you know how to run code in your chosen language, as presented in the configuration pages.

If you want to use another language, the introduction provides links to configure development environments for the C, C++, Java, LISP, Prolog, and SQL (with SQLite) languages. In many languages, it suffices to follow the models from the experimentation section to modify syntax, commands and functions from the code blocks. C and C++ are exceptions, because they require pointers to access memory.

Secondary Memory, File Systems and Files

Some of the very first topics of this material defined file systems. The first of them defined files, directories (folders) and paths (which can be relative or absolute). To use files in programming, it is important to understand the concepts of paths, relative paths and absolute paths. In programming, it is often preferable to use relative paths, because they make it easier to use files with the same structure of directories on different machines. Thus, if you do not know what a path or a work directory (or work dir) is, it is recommended to read the mentioned topic before this one.
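As a minimal sketch of the difference between relative and absolute paths, the following Python snippet uses the standard pathlib module. The file name example.txt is a hypothetical placeholder; no file is created on disk.

```python
from pathlib import Path

# Work directory of the running program.
work_directory = Path.cwd()

# A relative path is interpreted from the work directory.
relative_path = Path("example.txt")  # Hypothetical file name.

# The equivalent absolute path on this machine.
absolute_path = work_directory / relative_path

print("Work directory:", work_directory)
print("Relative path:", relative_path)
print("Absolute path:", absolute_path)
print("Is the relative path absolute?", relative_path.is_absolute())
print("Is the joined path absolute?", absolute_path.is_absolute())
```

The relative path stays the same on every machine; only the work directory (and, thus, the absolute path) changes.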

The second topic about file systems described file managers, programs that can perform tasks to create, organize and manipulate files and directories. In particular, a file manager will be necessary in this topic, as you will need to access the content that you will create.

Programming languages also provide features and abstractions to manipulate files. Files are abstractions of the secondary (persistent) memory of a machine. One can create a new (empty) file, write data to it, and save it to disk. A saved file can be loaded and read. It can also be modified by performing new write operations. With these operations, it is possible to save data in one session of a program and load it at any time in the future when the stored data is needed.

However, files can be manipulated differently. Programming languages normally provide two types of files:

  1. Text files;
  2. Binary files.

Text files are files that store data encoded as text. For instance, all source code files that are created for JavaScript, Lua, Python and GDScript are text files. You can create, open and modify text files using a text editor. Text files are convenient because they are usable and readable by human beings. On the other hand, they require more bytes for storage (for instance, the integer number 1234 is saved as the text "1234", which requires 4 bytes in ASCII or UTF-8) and more processing time (values must be converted to and from text), and they are usually accessed sequentially (that is, the file is read byte by byte, from the beginning).

Binary files are files that store data encoded according to the type they represent. For instance, the integer number 1234 would be saved as the sequence of bytes 00000100 11010010 (which can be written using 2 bytes). To read and edit a binary file, one uses a hexadecimal editor (or hex editor) instead of a text editor. Binary files are usually smaller (size in bytes) than text files, faster to write and read, and allow random access (that is, reading a specific position of the file), though they are not readable by human beings. Most image, video and audio files are typical examples of binary files. Although it is possible to open them in a text editor, the result will not be what one expects. In fact, efficient manipulation and use of binary files often requires specialized programs; for instance, an image viewer or editor, or an audio or video player or editor.
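The size difference can be checked with a short Python sketch. It encodes the integer 1234 as text and as a 16-bit binary value. The choice of a big-endian unsigned 16-bit layout (">H") is an assumption made for this illustration; real binary formats define their own layouts.

```python
import struct

number = 1234

# Text encoding: each digit becomes one byte in ASCII or UTF-8.
as_text = str(number).encode("utf-8")

# Binary encoding: ">H" means a big-endian unsigned 16-bit integer.
as_binary = struct.pack(">H", number)

# Bits of each byte of the binary encoding.
bits = " ".join(format(byte, "08b") for byte in as_binary)

print(as_text, len(as_text), "bytes")
print(bits, len(as_binary), "bytes")
```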

Regardless of the type, every program that retains information between sessions uses files. Even implementations of database management systems (DBMSs) use files internally. Thus, it is important to learn how to use files.

Nevertheless, a warning is necessary before starting.

IMPORTANT NOTICE

Before running the examples in this topic, it is important to know that file operations can lead to data loss. In particular, the creation of a new file can erase the contents of an existing file with the same name in the chosen directory. Similarly, a modification of an existing file is a persistent operation, which means that changes will be permanent. Therefore, care is required when creating new files or modifying existing ones. Before creating a file or opening one for writing, verify that the chosen directory does not already contain a file with the same name (to avoid data loss).

The author of this page is not responsible for any data loss (as will be mentioned soon, he will even try to minimize the risks). The reader is responsible for the files stored in her/his system. Important files must always have back-up copies and/or versioning.

The warning can be scary, though it is important. As files use secondary memory, the results of the operations are persistent. Thus, it is important to take proper care to avoid undesired data loss.

To minimize the chances of name conflicts, all created files will have the prefix franco-garcia-. For instance, franco-garcia-my_file.txt instead of my_file.txt or my file.txt. Every file will be created in the work directory, which will probably be the directory in which the source code file is stored and/or the code is interpreted. Unless your name is also Franco Garcia and/or you create files using the same prefix and convention, the chances of having files with the same name will be low. However, they exist and require care on your part. If you need to change the names, the path (name and directory) of the file to be created will be defined in a variable at the beginning of the code of each program (though it can be located after definitions of subroutines, constants and records), called file_path (in the case of the copy example, there will also be a variable called copy_path). Preferably, choose a unique name, to avoid the risk of overwriting an existing file.
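As an additional precaution in Python, a program can refuse to overwrite an existing file. The following is a minimal sketch that uses the exclusive creation mode "x" of open(), which fails if the file already exists. The function name create_without_overwriting() and the file name are hypothetical; a temporary directory is used so that the demonstration does not touch real files.

```python
import os
import tempfile

def create_without_overwriting(file_path, contents):
    """Create file_path only if it does not exist; return True on success."""
    try:
        # "x" requests exclusive creation: open() fails if the file exists.
        with open(file_path, "x", encoding="utf-8") as file:
            file.write(contents)
        return True
    except FileExistsError:
        return False

# The temporary directory is removed automatically at the end of the block.
with tempfile.TemporaryDirectory() as directory:
    file_path = os.path.join(directory, "franco-garcia-example.txt")
    first_attempt = create_without_overwriting(file_path, "first\n")
    second_attempt = create_without_overwriting(file_path, "second\n")

    with open(file_path, "r", encoding="utf-8") as file:
        kept_contents = file.read()

print(first_attempt, second_attempt, repr(kept_contents))
```

The second attempt is refused, so the content written by the first attempt is preserved.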

A good practice for introductory activities is creating a new directory in your computer to use as the work directory for the programs. All files that you create or read should belong to this directory. Another benefit of creating a directory is that you will be sure that the chosen place allows write operations. Operating systems can restrict the access to files and directories, as well as read and write operations, for certain combinations of user account and directory. These restrictions are called access permissions. Whenever one works with files, it is necessary that the current account has permission to read the directory and the file (if she/he wishes to read a file), and/or permission to write to the directory and the file (if she/he wishes to create a new file, write to it or modify an existing file). Permission errors are common, especially in shared machines. Thus, if your code is correct but the program does not work, check whether you have sufficient permissions for the path that was defined for the file.
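A program can make a preliminary check of permissions before trying to create a file. The sketch below uses os.access() from Python's standard library; the function name can_write_in() is hypothetical. The result is only a hint: the attempt to open the file can still fail, so error handling remains necessary.

```python
import os
import tempfile

def can_write_in(directory):
    """Report whether the current account may create files in directory."""
    # W_OK checks write permission; X_OK is needed to access entries inside.
    return os.path.isdir(directory) and os.access(directory, os.W_OK | os.X_OK)

with tempfile.TemporaryDirectory() as directory:
    writable = can_write_in(directory)
    missing = can_write_in(os.path.join(directory, "does-not-exist"))

print("Temporary directory is writable:", writable)
print("Nonexistent directory is writable:", missing)
```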

Text Files

Text files can be easier to operate than binary files. This is due to some reasons:

  1. One can inspect the contents of a text file using a text editor;
  2. One can modify the contents of a text file using a text editor;
  3. There are multiple command line tools to manipulate text files. In fact, systems based on Unix (such as Linux) primarily use text files for configuration and data exchange for command line programs;
  4. In programming, operations with text files are similar to using subroutines or commands such as print() and input().

The last reason may seem strange, though it applies to many programming languages. Many programming languages abstract console (terminal) operations using files. In such languages, there exist three files with special names and purposes:

  1. Standard output, better known as stdout;
  2. Standard input, better known as stdin;
  3. Standard error output, better known as stderr.

The use of print() or console.log() sends the written message to stdout, which is then written to a terminal. Subroutines such as input() or io.read() use stdin as an input buffer (memory), which temporarily stores values read from a keyboard. This is the reason why some input subroutines or commands (such as io.read() in Lua) provide incorrect values after reading certain data types: remainders of data that were read may still be stored in the stdin file, and they are then provided to the next read.
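To illustrate that the standard streams behave like files, here is a minimal Python sketch. It writes to sys.stdout directly and then redirects print() to another file-like object. io.StringIO is an in-memory text file from the standard library, chosen here as an assumption to avoid touching the disk.

```python
import io
import sys

# print() writes to sys.stdout by default; the two lines are equivalent.
print("Hello")
sys.stdout.write("Hello\n")

# Any file-like object can replace the destination of print().
memory_file = io.StringIO()
print("Hello", file=memory_file)
captured = memory_file.getvalue()
memory_file.close()

# Error messages can be sent to sys.stderr in the same way.
print("Something went wrong", file=sys.stderr)
```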

The standard error output has not yet been covered in previous topics, though, hereafter, you can start using it in your programs. For instance:

console.error("Error message")
// JavaScript also provides a warning message:
console.warn("Warning message")
import sys

print("Error message", file=sys.stderr)

# Python also provides a warning message:
import warnings
warnings.warn("Warning message")
io.stderr:write("Error message")
error("Error message")
extends Node

func _ready():
    printerr("Error message")
    # GDScript also provides a warning message:
    push_warning("Warning message")

Documentation (the versions for warnings may use another output file; for instance, stdout):

The previous examples illustrate what one should do to write to another file: the file must be designated somehow, to indicate where the output should go. The same principle applies to files created by programmers. The difference is that, in this case, the programmer must perform at least two additional tasks: open (or create) the file before using it, and close the opened file when it is no longer necessary.

Basic Operations Using Text Files

There are three main ways of using a file:

  1. Use a file to write data;
  2. Use a file to read data;
  3. Use a file to read and write data.

In the following examples, the first subsection will create a text file with five lines.

Olá, meu nome é Franco.
Olá, meu nome é Franco.
1 2 3
-1.23

The two initial lines are phrases. The third line has three integer numbers (which will be stored as text). The fourth line stores a real number, which will be stored as text. The fifth line is an empty line. Unfortunately, it does not show in the code formatted in HTML.

Next, a program will be created to read the written content.

Every operation with files follows the same model (pattern):

  1. Opening or creation of a file;
  2. Operations using the file;
  3. Closure of the file.

A file can be closed when it is no longer necessary. This can be performed immediately after finishing all read and/or write operations, or before the program ends. However, one must not use a file after closing it; this is an error. To use a closed file, one should open it again before using it, repeating the process.
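The open, use and close pattern can be sketched in Python as follows. The sketch also shows what happens when a closed file is used: Python raises a ValueError. The file name is hypothetical, and a temporary directory keeps the demonstration away from real files.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as directory:
    file_path = os.path.join(directory, "franco-garcia-closed.txt")

    file = open(file_path, "w", encoding="utf-8")  # 1. Open or create.
    file.write("first line\n")                     # 2. Use.
    file.close()                                   # 3. Close.

    try:
        file.write("second line\n")  # Error: the file is already closed.
        write_after_close_failed = False
    except ValueError as exception:
        write_after_close_failed = True
        print("Error:", exception)

    # Opening the file again makes it usable once more.
    file = open(file_path, "a", encoding="utf-8")
    file.write("second line\n")
    file.close()

    file = open(file_path, "r", encoding="utf-8")
    final_contents = file.read()
    file.close()

print(final_contents, end="")
```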

Furthermore, it is important to close the file. Some implementations may do it automatically; personally, I prefer to do it manually. Closing a file ensures that all data will be saved. For better performance, some implementations group write operations in memory before saving them to the file. Closing the file forces the pending data to be written (a flush). There exist subroutines to perform a flush when desired; they must only be used for output files (commonly called output streams).
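A minimal Python sketch of an explicit flush follows. After the call to flush(), a second handle to the same file can read the data, even though the writer is still open. The file name is hypothetical.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as directory:
    file_path = os.path.join(directory, "franco-garcia-flush.txt")

    writer = open(file_path, "w", encoding="utf-8")
    writer.write("buffered text\n")
    # At this point, the text may still be only in the writer's buffer.

    writer.flush()  # Hand the buffered data over to the operating system.

    # A second handle can now read what was written.
    reader = open(file_path, "r", encoding="utf-8")
    contents_after_flush = reader.read()
    reader.close()

    writer.close()  # Closing also flushes whatever remains in the buffer.

print(contents_after_flush, end="")
```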

Text files are also commonly used for logging tasks, usually for debugging programs. One can store messages or values of interest in a file to inspect them in case of problems or crashes in a program. Some programming languages provide log implementations in the standard library. If they do not exist, it is common to find libraries that provide the feature. Otherwise, it suffices to create a text file and keep it open while the program is being used. Whenever one wants to log a message, she/he just needs to write it to the file. Before the program ends, the file must be closed.
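Python is one of the languages with a log implementation in the standard library (the logging module). The following is a minimal sketch that sends messages to a text file; the logger name "example", the file name and the message format are assumptions chosen for the illustration.

```python
import logging
import os
import tempfile

with tempfile.TemporaryDirectory() as directory:
    log_path = os.path.join(directory, "franco-garcia-log.txt")

    logger = logging.getLogger("example")  # Hypothetical logger name.
    logger.setLevel(logging.DEBUG)
    logger.propagate = False  # Keep the messages out of the root logger.

    handler = logging.FileHandler(log_path, encoding="utf-8")
    handler.setFormatter(logging.Formatter("%(levelname)s: %(message)s"))
    logger.addHandler(handler)

    logger.info("Program started.")
    logger.warning("Value of interest: %d", 42)

    # Closing the handler closes the underlying log file.
    logger.removeHandler(handler)
    handler.close()

    file = open(log_path, "r", encoding="utf-8")
    log_contents = file.read()
    file.close()

print(log_contents, end="")
```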

Writing a Text File

To write in a text file, one must convert the desired data to a string. Some implementations perform type conversion automatically; if one does not, it suffices to convert the data before using it. For better performance, it is usually better to store the entire contents in a string to write them at once. A variable used for such purpose is called a buffer.
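A minimal Python sketch of the buffer technique follows: the values are converted to strings, accumulated in a single string, and written with one write operation. The list of values and the file name are hypothetical.

```python
import os
import tempfile

values = ["Olá, meu nome é Franco.", 1, 2, 3, -1.23]

# Convert every value to a string and accumulate the result in a buffer.
buffer = "\n".join(str(value) for value in values) + "\n"

with tempfile.TemporaryDirectory() as directory:
    file_path = os.path.join(directory, "franco-garcia-buffer.txt")

    file = open(file_path, "w", encoding="utf-8")
    file.write(buffer)  # A single write operation for the whole content.
    file.close()

    file = open(file_path, "r", encoding="utf-8")
    written_contents = file.read()
    file.close()

print(written_contents, end="")
```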

Although it is possible to use JavaScript outside browsers, the following example considers the use of the language in a browser. Thus, it will be slightly different from the versions written for Python, Lua and GDScript, which are commonly used outside a browser.

Furthermore, the traditional Hello, my name is Franco will be written in the Portuguese version Olá, meu nome é Franco. This makes it possible to observe how accented characters are encoded in some cases (such as in binary files, later in this topic). You can use the English version or another text, if you wish.

let file_path = "franco-garcia-written_text_file.txt"
let contents = "Olá, meu nome é Franco.\n"
contents += "Olá, meu nome é Franco.\n"
contents += "1 2 3\n"
contents += "-1.23\n"

let file = new File([contents], file_path, {type: "text/plain"})
console.log("File created successfully.")

let download_link = document.createElement("a")
download_link.target = "_blank"
download_link.href = URL.createObjectURL(file)
download_link.download = file.name
if (confirm("Do you want to download the file '" + file.name + "'?")) {
    download_link.click()
}
// The object URL is released whether the user confirms or cancels.
URL.revokeObjectURL(download_link.href)
import io
import sys

try:
    file_path = "franco-garcia-written_text_file.txt"
    file = open(file_path, "w")

    file.writelines([
        "Olá, meu nome é Franco.\n",
        "Olá, meu nome é Franco.\n",
        "1 2 3\n",
        "-1.23\n"
    ])

    file.close()

    print("File created successfully.")
except IOError as exception:
    print("Error when trying to create the text file.", file=sys.stderr)
    print(exception)
except OSError as exception:
    print("Error when trying to create the text file.", file=sys.stderr)
    print(exception)
-- <https://en.cppreference.com/w/c/program/EXIT_status>
local EXIT_FAILURE = 1

local file_path = "franco-garcia-written_text_file.txt"
local file = io.open(file_path, "w")
if (file == nil) then
    print(debug.traceback())

    error("Error when trying to create the text file.")
    os.exit(EXIT_FAILURE)
end

file:write("Olá, meu nome é Franco.\n")
file:write("Olá, meu nome é Franco.\n")
file:write("1 2 3\n")
file:write("-1.23\n")

io.close(file)

print("File created successfully.")
extends Node

# <https://en.cppreference.com/w/c/program/EXIT_status>
const EXIT_FAILURE = 1

func _ready():
    var file_path = "franco-garcia-written_text_file.txt"
    var file = File.new()
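    if (file.open(file_path, File.WRITE) != OK):
        printerr("Error when trying to create the text file.")
        get_tree().quit(EXIT_FAILURE)
        # quit() does not stop the function immediately; return to skip
        # the write operations below.
        return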

    file.store_string("Olá, meu nome é Franco.\n")
    file.store_string("Olá, meu nome é Franco.\n")
    file.store_string("1 2 3\n")
    file.store_string("-1.23\n")

    file.close()

    print("File created successfully.")

The execution of the program in Python, Lua and GDScript will generate a text file named franco-garcia-written_text_file.txt in the directory in which the program is run (that is, in the work directory). The execution of the program in JavaScript will create a text file that can be saved to the machine. All files will have the same content; it is possible to use a text editor to open the file and read the contents. It is also possible to modify the resulting file in a text editor. However, the examples that read the file will assume the existence of the data created by the program; thus, if you modify the values, run the programs again to generate a new file that is equivalent to the original one.

The examples in Python, Lua and GDScript provide the classic way of working with files.

The Python version opens a new file using open() (documentation; it is possible to use a named parameter to choose the encoding; for instance encoding="utf-8"). The parameter "w" (write) designates the mode to open the file; "w" means to open a text file in write mode, creating a new empty file (if it does not exist) or erasing all the existing contents (if the file exists). Next, the file is manipulated using the classes TextIOBase (documentation) and TextIOWrapper (documentation). The method writelines() (documentation) allows writing a list of strings to the file; it does not add line breaks, so each string must include its own. After finishing the use, a call to close() (documentation) closes the file.

The Lua version uses io.open() (documentation) to open the file; the parameter "w" works similarly to Python's. The writing uses the method file:write(), which works like io.write() (documentation). After finishing the use, io.close() (documentation) can close the file. As io.write() and io.close() rely, respectively, on the file methods file:write() and file:close(), it is also worth consulting the documentation for those methods. In case of error, os.exit() (documentation) allows finishing the program prematurely. EXIT_FAILURE is a constant defined in C to indicate that a program ended with an error (documentation); as it does not exist in Lua, a constant was defined with the value 1 (the value can vary among operating systems and platforms; 1 is a common value for desktop architectures). The function debug.traceback() (documentation) provides data about the call stack at the moment of an error.

The version in GDScript uses the class File (documentation) for the operations. The method open() (documentation) opens the file; the parameter File.WRITE (documentation) designates the mode for the operation, which matches "w". The method store_string() (documentation) writes a string to the file. After using the file, close() (documentation) closes it. The constant EXIT_FAILURE is defined as in Lua, following the same considerations. To end the program, one uses get_tree() (documentation), then quit() (documentation).

In JavaScript for back-end (for instance, using Node.js), it is possible to use files like in Python, Lua and GDScript. For front-end (in other words, in browsers), the use of files is slightly different, as illustrated in the example. Instead of creating a file in the system, one temporarily creates a file in the browser for download. This requires that the whole text to be written is ready (in this case, it is stored in contents). Next, File() (documentation) is used to create the file. The chosen type "text/plain" is a media type, also known as a Multipurpose Internet Mail Extensions type (MIME type; documentation). With the created file, a link is created using document.createElement() (documentation) and filled with the file's data: the target in target ("_blank" means a new window or tab), the file address in href (generated with URL.createObjectURL(); documentation), and the filename suggested for the download in download. Finally, a confirmation dialog is created with confirm() (documentation). If the user confirms the dialog, the link is clicked programmatically with click() (documentation). Next, the address created for the file is released with URL.revokeObjectURL() (documentation). The file will be transferred from the browser to the machine, as if it were a real download. However, as the code is local, one does not need an Internet connection to acquire the resulting file.

Besides the mode "w" for writing, another usual choice is called append ("a"), which works similarly to "w" if the file does not exist. However, if a file with the same path exists, the append mode opens the existing file at its end to add new data (instead of deleting the existing content). This is very useful when one needs to add new content to the end of a file, such as in a log. The inclusion of new data is simple at the end of a text file, though it is more complicated (and less efficient) anywhere else.
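A minimal Python sketch of the append mode follows. The first call creates the file; the second one adds a new line to the end without erasing the first. The function name append_line() and the file name are hypothetical.

```python
import os
import tempfile

def append_line(file_path, line):
    """Add a line to the end of a text file, creating it if necessary."""
    file = open(file_path, "a", encoding="utf-8")
    file.write(line + "\n")
    file.close()

with tempfile.TemporaryDirectory() as directory:
    file_path = os.path.join(directory, "franco-garcia-append.txt")

    append_line(file_path, "first entry")   # The file does not exist yet.
    append_line(file_path, "second entry")  # The existing content is kept.

    file = open(file_path, "r", encoding="utf-8")
    appended_contents = file.read()
    file.close()

print(appended_contents, end="")
```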

There are also modes that open a file for both reading and writing, such as "r+" and "w+". The mode "r+" is useful for editing existing files, because it preserves the stored contents (in Python and Lua, it fails if the file does not exist); the mode "w+" creates a new empty file, erasing the contents of an existing one. Unlike append, the file can be read and written at any position (instead of only being written at the end). However, writing before the end of the file overwrites the data stored at that position; inserting new data there requires moving the existing content forward. As will be commented later as a technique, it is usually easier to create a new file with the modified content and overwrite the original file (or keep it as a back-up).
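The technique of creating a new file with the modified content can be sketched in Python as follows. The function name rewrite_file() and the ".new" suffix are hypothetical choices for this illustration; os.replace() is the standard library call that puts the new file in the place of the original.

```python
import os
import tempfile

def rewrite_file(file_path, old_text, new_text):
    """Replace old_text by new_text in a text file, using a new file."""
    file = open(file_path, "r", encoding="utf-8")
    contents = file.read()
    file.close()

    # Write the modified content to a new file in the same directory.
    new_path = file_path + ".new"  # Hypothetical naming convention.
    file = open(new_path, "w", encoding="utf-8")
    file.write(contents.replace(old_text, new_text))
    file.close()

    # Overwrite the original file with the new one.
    os.replace(new_path, file_path)

with tempfile.TemporaryDirectory() as directory:
    file_path = os.path.join(directory, "franco-garcia-edit.txt")
    file = open(file_path, "w", encoding="utf-8")
    file.write("Olá, meu nome é Franco.\n1 2 3\n")
    file.close()

    rewrite_file(file_path, "1 2 3", "10 20 30")

    file = open(file_path, "r", encoding="utf-8")
    edited_contents = file.read()
    file.close()
    leftover_exists = os.path.exists(file_path + ".new")

print(edited_contents, end="")
```

To keep a back-up instead, the original file could be renamed before the call to os.replace().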

Closing a File Automatically When a Scope Ends

In Python and Lua (since version 5.4), there is an alternative way to close a file. The alternative automatically closes a file when its variable reaches the end of the scope. The following example illustrates the technique using a text file; however, one could modify the calls to open() in Python or io.open() in Lua to use it with any other file types or operations.

import io
import sys

file_path = "franco-garcia-written_scoped_text_file.txt"
with open(file_path, "w") as file:
    file.writelines([
        "Olá, meu nome é Franco.\n",
        "Olá, meu nome é Franco.\n",
        "1 2 3\n",
        "-1.23\n"
    ])

    print("File created successfully.")
    # End of the scope; the file is closed automatically.

print("File closed.")
local EXIT_FAILURE = 1

local file_path = "franco-garcia-written_scoped_text_file.txt"
do
    local file <close> = io.open(file_path, "w")
    if (file == nil) then
        print(debug.traceback())

        error("Error when trying to create the text file.")
        os.exit(EXIT_FAILURE)
    end

    file:write("Olá, meu nome é Franco.\n")
    file:write("Olá, meu nome é Franco.\n")
    file:write("1 2 3\n")
    file:write("-1.23\n")

    print("File created successfully.")
    -- End of the scope; the file is closed automatically.
end

print("File closed.")

In Lua, one can force the creation of a new scope using the pair do and end (this might not be necessary if the code is defined inside a subroutine, which will have its own local scope). Next, one should use the <close> modifier to designate that the file should be closed at the end of the scope (documentation). It is also possible to define custom <close> operations for types defined as records in tables.

In Python, the with reserved word should be used to open the file (documentation). It should be noted that exceptions can still occur (an example of exception handling has been omitted in the code).

The alternative way closes the file when the scope ends. This has two advantages. The first is that there is no risk of forgetting to close the file. The second is that the file is still closed automatically in the case of problems (such as exceptions). Thus, automatic closure can be safer than the traditional way of closing files. In C++, the generalization of this technique is called resource acquisition is initialization (RAII).
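The second advantage can be verified with a minimal Python sketch. The RuntimeError is a simulated problem raised only for this demonstration; the file is closed by with even though the block ends with an exception.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as directory:
    file_path = os.path.join(directory, "franco-garcia-scoped.txt")

    try:
        with open(file_path, "w", encoding="utf-8") as file:
            file.write("partial content\n")
            raise RuntimeError("Simulated problem")  # Hypothetical failure.
    except RuntimeError as exception:
        print("Error:", exception)

    # The variable still exists, but the file it refers to is closed.
    closed_after_error = file.closed

print("File closed after the error:", closed_after_error)
```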

Reading a Text File

There are three main ways of reading a text file:

  1. Read the entire file as a single string;
  2. Split the content of the file in an array of values separated by a delimiter. For instance, reading a file line per line;
  3. Extract data from the text file. In the third way, the goal is extracting data and converting them to variables with more appropriate types.

There are other ways, such as reading the file character by character. The end of a text file is signaled by a special condition called end of file (EOF). For ease of processing text files, it is common to end them with an empty line before the EOF (although this is not necessary).
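The three main ways can be compared in a short Python sketch. To keep it self-contained, the string contents stands in for the text that would be read from the file created earlier; this substitution is an assumption made for the illustration.

```python
# Stand-in for the content read from the text file.
contents = "Olá, meu nome é Franco.\nOlá, meu nome é Franco.\n1 2 3\n-1.23\n"

# 1. The entire content as a single string.
whole_text = contents

# 2. The content split into lines (the delimiter is the line break).
lines = contents.splitlines()

# 3. Data extracted and converted to more appropriate types.
integers = [int(text) for text in lines[2].split()]
real_number = float(lines[3])

print(len(whole_text), "characters")
print(len(lines), "lines")
print(integers, real_number)
```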

Reading the Whole Text File

It is usually simple to read a whole text file, although it can require more primary memory than other ways of reading a file. An advantage is that storing the whole content of the file in a variable is useful as an optimization, because it avoids reading the secondary memory multiple times (which is slower than the primary memory). After reading all the content, the variable can be processed as any other string.

The JavaScript version cannot be used directly, because it requires an HTML page to match it (the HTML code is provided later in this section). It is also more complex than the others, due to some restrictions imposed by browsers. On the other hand, the implementation allows showing the read content in the browser.

// This file must be saved in a file called "script.js".
// It will be processed by an HTML page with code to send the
// text file by a form.

// <https://francogarcia.com/en/blog/development-environments-javascript/>
function add_element(value, element_name = "p") {
    const parent = document.getElementById("contents")
    const new_element = document.createElement(element_name)
    new_element.innerHTML = value
    parent.appendChild(new_element)
}

function read_file(text_file) {
    // console.log(text_file)
    if (!text_file) {
        // No file was chosen: return false to avoid submitting the form.
        return false
    } else if (!(text_file instanceof File)) {
        return false
    }

    let file_reader = new FileReader()
    file_reader.onload = function(event) {
        let contents = event.target.result
        console.log(contents)

        // Replaces all occurrences of \n with a line break <br/>
        // to show the text in the browser.
        // If the system defines line breaks as \r\n, the regular
        // expression replaces the whole sequence with a single tag.
        add_element(contents.replace(new RegExp("\\r?\\n", "g"), "<br/>"))
    }
    file_reader.readAsText(text_file)

    // Disallows the submission of the form, allowing the visualization of the
    // result of add_element() in the same page.
    return false
}
import io
import sys

try:
    file_path = "franco-garcia-written_text_file.txt"
    file = open(file_path, "r")
    contents = file.read()
    file.close()

    print("File read successfully.")
    print(contents)
except IOError as exception:
    print("Error when trying to read the text file.", file=sys.stderr)
    print(exception)
except OSError as exception:
    print("Error when trying to read the text file.", file=sys.stderr)
    print(exception)
local EXIT_FAILURE = 1

local file_path = "franco-garcia-written_text_file.txt"
local file = io.open(file_path, "r")
if (file == nil) then
    print(debug.traceback())

    error("Error when trying to read the text file.")
    os.exit(EXIT_FAILURE)
end

local contents = file:read("*all")
io.close(file)

print("File read successfully.")
print(contents)
extends Node

const EXIT_FAILURE = 1

func _ready():
    var file_path = "franco-garcia-written_text_file.txt"
    var file = File.new()
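    if (file.open(file_path, File.READ) != OK):
        printerr("Error when trying to read the text file.")
        get_tree().quit(EXIT_FAILURE)
        # quit() does not stop the function immediately; return to skip
        # the read operations below.
        return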

    var contents = file.get_as_text()
    file.close()

    print("File read successfully.")
    print(contents)

The structure of the code to read a file is similar to the one used to write a file, because it also starts by opening a file and ends by closing it. An attempt to read a nonexistent file results in an error. The program will try to read the text file that was created by the previous program. If it has been removed for any reason, it must be recreated.

In Python, the reading operation uses the parameter "r" (read). Next, the method read() (documentation) without parameters reads the entire file; providing a parameter allows reading parts of the file (for a text file, the parameter is the maximum number of characters that should be read; for a binary file, the number of bytes). Alternatively, raw binary streams provide readall() (documentation) to read the whole file as well.
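A minimal Python sketch of reading parts of a text file with read() follows. To keep it self-contained, io.StringIO (an in-memory text file) stands in for a file opened with open(file_path, "r"); the chunk size of 5 characters is an arbitrary choice for the illustration.

```python
import io

# io.StringIO stands in for a text file opened in read mode.
file = io.StringIO("Olá, meu nome é Franco.\n")

chunks = []
chunk = file.read(5)  # Reads up to 5 characters at a time.
while chunk != "":    # An empty string signals the end of the file.
    chunks.append(chunk)
    chunk = file.read(5)

file.close()

print(chunks)
```

Accented characters such as á and é count as a single character each, although they require more than one byte in UTF-8.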

In Lua, the parameter "r" is also used to specify the read mode. The use of file:read() (documentation) allows reading a file the same way that io.read() (documentation) reads the standard input. Thus, one can read a number using "*number", a line with "*line", the whole file with "*all" or a number of bytes by providing an integer number.

In GDScript, the method get_as_text() (documentation) reads an entire file encoded as UTF-8. To read parts of the file, one must use the other provided methods.

The JavaScript version will require an HTML page with a form to send the file, because, for security reasons, a file submission in a browser requires that a user starts the interaction with a click (or with any other explicit interaction). The page is an adaptation of the example provided in the JavaScript development environment setup. The procedure add_element() has also been previously defined in the example from the environment configuration. In this topic, it is used to show the contents of the file in the browser.

<!DOCTYPE html>
<html lang="en">

  <head>
    <meta charset="utf-8">
    <title>Text File Read</title>
    <meta name="author" content="Franco Eusébio Garcia">
    <meta name="viewport" content="width=device-width, initial-scale=1">
  </head>

  <body>
    <header>
      <h1>File Read</h1>
    </header>

    <main>
      <!-- Form to send the text file. -->
      <form method="post"
            enctype="multipart/form-data"
            onsubmit="return read_file(files.files[0])">
        <label for="files">Choose a text file:</label>
        <input id="files"
               name="files"
               type="file"
               accept="text/plain"/>
        <input type="submit"/>
      </form>

      <div id="contents">
      </div>

      <!-- The name of the JavaScript file must match the one defined below. -->
      <script src="./script.js"></script>
    </main>
  </body>

</html>

In the HTML page, the tag <form> (documentation) creates a form. The form has an <input> (documentation) tag, configured as a file picker. To do this, type="file" (documentation) is used. The form has a button to submit the chosen file, defined by <input type="submit"/>. The processing will be performed by the function read_file(), as defined in onsubmit in the <form> definition. The function read_file() must be implemented in a file named script.js, stored in the same directory as the HTML file.

In this case, the form performs local processing, which means that no data is sent to the Internet. In the JavaScript code, instanceof (documentation) checks whether the parameter is an instance of the type File. If the parameter represents a file, a FileReader (documentation) reads the file. The read is performed by the call to readAsText() (documentation). When it ends, the implementation runs the code defined in onload (documentation), which should be defined before the call to readAsText(). In onload, an anonymous (lambda) function was defined to write the contents in the console (terminal) and also on the page displayed by the browser, using add_element(). The regular expression replaces line breaks in the string with the line break tags used by the browser, to render the lines of the text correctly.

In this case, onload defines a callback function, called by the FileReader class implementation to process the read data in an appropriate time.

Reading a Delimited Text File (Line By Line)

When the quantity of free primary memory (RAM) is sufficient to store the entire contents of a file, it can be read at once. Although this is common in modern machines (such as desktops or recent mobile devices), there are machines with limited quantities of memory (such as embedded devices). In other cases, it can be desirable to operate with smaller parts of the file in primary memory.

A second common approach to read text files is reading a part of the file until a next delimiter. Normally, this delimiter is a line break. In other words, one can read a line of the file at a time.

File implementations commonly provide a subroutine to read a line (JavaScript for browsers is an exception). With a repetition structure, a program can read each line of a file until it ends.

import io
import sys

try:
    file_path = "franco-garcia-written_text_file.txt"
    file = open(file_path, "r")

    text_line = file.readline()
    while (text_line):
        print(text_line, end="")
        text_line = file.readline()

    file.close()
    print("File read successfully.")
except IOError as exception:
    print("Error when trying to read the text file.", file=sys.stderr)
    print(exception)
except OSError as exception:
    print("Error when trying to read the text file.", file=sys.stderr)
    print(exception)
local EXIT_FAILURE = 1

local file_path = "franco-garcia-written_text_file.txt"
local file = io.open(file_path, "r")
if (file == nil) then
    print(debug.traceback())

    error("Error when trying to read the text file.")
    os.exit(EXIT_FAILURE)
end

local text_line = file:read("*line")
while (text_line) do
    print(text_line)
    text_line = file:read("*line")
end

io.close(file)

print("File read successfully.")

-- Alternative:
for text_line in io.lines(file_path) do
    print(text_line)
end
extends Node

const EXIT_FAILURE = 1

func _ready():
    var file_path = "franco-garcia-written_text_file.txt"
    var file = File.new()
    if (file.open(file_path, File.READ) != OK):
        printerr("Error when trying to read the text file.")
        get_tree().quit(EXIT_FAILURE)

    var text_line = file.get_line()
    while (text_line):
        print(text_line)
        text_line = file.get_line()

    print("File read successfully.")

Python provides the method readline() (documentation) to read the next line of a file. Lua allows using io.read() or file:read() with the parameter "*line" as if it was the standard input; the language also provides io.lines() (documentation) for a line iterator. GDScript provides get_line() (documentation) to read the next line.

In all cases, the read value can be used as the condition for a while loop. While the value is a valid or non-null string, the code will be repeated.

If one wishes to count the number of lines in a file, she/he can create an integer variable as a counter and increment it every time a new line is read successfully.
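The counter can be sketched in Python as follows. This is a minimal sketch: the helper name count_lines is hypothetical, and the in-memory io.StringIO stream in the usage lines is an assumption that stands in for a file returned by open().

```python
import io


def count_lines(text_stream):
    """Count the lines of an open text stream, reading one line at a time."""
    line_count = 0
    text_line = text_stream.readline()
    while text_line:
        # A non-empty string means that a line was read successfully.
        line_count += 1
        text_line = text_stream.readline()
    return line_count


# Usage with an in-memory stream standing in for a file opened with open().
sample = io.StringIO("first line\nsecond line\n1 2 3\n")
print(count_lines(sample))
```

Only the current line is kept in memory, which matches the motivation for reading a file line by line.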

Reading a Text File and Extracting Data

When one knows the contents of a file, or if the file follows a well-defined and regular format, she/he can extract stored data to process it with greater granularity. To do this, the data from strings can be converted to more suitable types, such as integer numbers, real numbers or logic values. Thus, it is possible, for instance, to perform arithmetic, relational and logic operations with the extracted values.

For instance, the file that was created in the previous sections has two lines of text, followed by a line with three integer numbers, followed by a line with a real number, followed by an empty line, and the end of the file. As the format is known, it can be read at once (or line by line) and the data can be extracted.

There are two main approaches to read a file for extraction.

  1. The whole file can be read at once and split in an array using delimiters. Next, one processes each value of the array.
  2. The text file can be imagined as if it represented all inputs provided by an end-user while she/he was using the program. In this case, the file can be interpreted as if it were the source of values for input commands such as input(), io.read() or prompt().

In some programming languages (for instance, JavaScript for browsers), the first case will be imposed, as the whole file will be read. In other languages, one can choose the approach that makes it simpler to solve the problem.

// This file must be saved in a file called "script.js".
// It will be processed by an HTML page with code to send the
// text file by a form.

// <https://francogarcia.com/en/blog/development-environments-javascript/>
function add_element(value, element_name = "p") {
    const parent = document.getElementById("contents")
    const new_element = document.createElement(element_name)
    new_element.innerHTML = value
    parent.appendChild(new_element)
}

function read_file(text_file) {
    if (!text_file) {
        return
    } else if (!(text_file instanceof File)) {
        return
    }

    let file_reader = new FileReader()
    file_reader.onload = function(event) {
        // Approach 1: read the whole file and use split.
        console.log("Approach 1")
        let contents = event.target.result

        let lines = contents.split("\n")
        console.log(lines)

        let first_phrase = lines[0]
        let second_phrase = lines[1]
        let integer_numbers = []
        let integer_numbers_sum = 0
        for (let number_text of lines[2].split(" ")) {
            let number = parseInt(number_text)
            integer_numbers.push(number)
            integer_numbers_sum += number
        }

        let real_number = parseFloat(lines[3])

        console.log(first_phrase)
        console.log(second_phrase)
        console.log(integer_numbers, integer_numbers_sum)
        console.log(real_number, "Positive number?", real_number > 0)

        add_element(first_phrase)
        add_element(second_phrase)
        add_element("[" + integer_numbers + "] " + integer_numbers_sum)
        add_element(real_number + " " +  "Positive number?" + " " + (real_number > 0))
    }
    file_reader.readAsText(text_file)

    // Disallows the submission of the form, allowing the visualization of the
    // result of add_element() in the same page.
    return false
}
import io
import sys

try:
    file_path = "franco-garcia-written_text_file.txt"
    file = open(file_path, "r")

    # Approach 1: read the whole file and use split.
    print("Approach 1")
    contents = file.read()

    lines = contents.split("\n")
    print(lines)

    first_phrase = lines[0]
    second_phrase = lines[1]
    integer_numbers = []
    integer_numbers_sum = 0
    for number_text in lines[2].split():
        number = int(number_text)
        integer_numbers.append(number)
        integer_numbers_sum += number

    real_number = float(lines[3])

    print(first_phrase)
    print(second_phrase)
    print(integer_numbers, integer_numbers_sum)
    print(real_number, "Positive number?", real_number > 0)

    # Approach 2: read the file as if it were the input from a user.
    print("\nApproach 2")
    file.seek(0) # or file.seek(0, 0) or file.seek(0, io.SEEK_SET)

    first_phrase = file.readline().rstrip()
    second_phrase = file.readline().rstrip()
    integer_numbers = []
    integer_numbers_sum = 0
    for number_text in file.readline().split():
        number = int(number_text)
        integer_numbers.append(number)
        integer_numbers_sum += number

    real_number = float(file.readline())

    print(first_phrase)
    print(second_phrase)
    print(integer_numbers, integer_numbers_sum)
    print(real_number, "Positive number?", real_number > 0)

    file.close()

except IOError as exception:
    print("Error when trying to read the text file.", file=sys.stderr)
    print(exception)
except OSError as exception:
    print("Error when trying to read the text file.", file=sys.stderr)
    print(exception)
function write_indentation(level)
    for indentation = 1, level do
        io.write("  ")
    end
end

function write_table(a_table, level)
    level = level or 1
    if (type(a_table) == "table") then
        io.write("{\n")
        for key, value in pairs(a_table) do
            write_indentation(level)
            io.write(tostring(key) .. ": ")
            write_table(value, level + 1)
            io.write("\n")
        end
        write_indentation(level - 1)
        io.write("},")
    else
        local quotes = ""
        if (type(a_table) == "string") then
            quotes = "\""
        end
        io.write(quotes .. tostring(a_table) .. quotes .. ",")
    end
end

function split(a_string, delimiter)
    delimiter = delimiter or " "
    local result = {}
    local size = #a_string
    local begin_at = 1
    while (begin_at <= size) do
        local end_at, next = string.find(a_string, delimiter, begin_at, true)
        if (end_at ~= nil) then
            table.insert(result, string.sub(a_string, begin_at, end_at - 1))
            begin_at = next + 1
        else
            table.insert(result, string.sub(a_string, begin_at))
            begin_at = size + 1
        end
    end

    if (string.sub(a_string, -#delimiter) == delimiter) then
        table.insert(result, "")
    end

    return result
end

-- The program starts here.
local EXIT_FAILURE = 1
local file_path = "franco-garcia-written_text_file.txt"
local file = io.open(file_path, "r")
if (file == nil) then
    print(debug.traceback())

    error("Error when trying to read the text file.")
    os.exit(EXIT_FAILURE)
end

-- Approach 1: read the whole file and use split.
print("Approach 1")
local contents = file:read("*all")

local lines = split(contents, "\n")
write_table(lines)
print()

local first_phrase = lines[1]
local second_phrase = lines[2]
local integer_numbers = {}
local integer_numbers_sum = 0
for _, number_text in ipairs(split(lines[3], " ")) do
    local number = tonumber(number_text)
    table.insert(integer_numbers, number)
    integer_numbers_sum = integer_numbers_sum + number
end

local real_number = tonumber(lines[4])

print(first_phrase)
print(second_phrase)
write_table(integer_numbers)
print(" " .. integer_numbers_sum)
print(real_number, "Positive number?", real_number > 0)

-- Approach 2: read the file as if it were the input from a user.
print("\nApproach 2")
file:seek("set")

first_phrase = file:read("*line")
second_phrase = file:read("*line")
integer_numbers = {}
integer_numbers_sum = 0
for index_number = 1, 3 do
    local number = file:read("*number")
    table.insert(integer_numbers, number)
    integer_numbers_sum = integer_numbers_sum + number
end

real_number = tonumber(file:read("*number"))

print(first_phrase)
print(second_phrase)
write_table(integer_numbers)
print(" " .. integer_numbers_sum)
print(real_number, "Positive number?", real_number > 0)

io.close(file)
extends Node

const EXIT_FAILURE = 1

func _ready():
    var file_path = "franco-garcia-written_text_file.txt"
    var file = File.new()
    if (file.open(file_path, File.READ) != OK):
        printerr("Error when trying to read the text file.")
        get_tree().quit(EXIT_FAILURE)

    # Approach 1: read the whole file and use split.
    print("Approach 1")
    var contents = file.get_as_text()

    var lines = contents.split("\n")
    print(lines)

    var first_phrase = lines[0]
    var second_phrase = lines[1]
    var integer_numbers = []
    var integer_numbers_sum = 0
    for number_text in lines[2].split(" "):
        var number = int(number_text)
        integer_numbers.append(number)
        integer_numbers_sum += number

    var real_number = float(lines[3])

    print(first_phrase)
    print(second_phrase)
    printt(integer_numbers, integer_numbers_sum)
    printt(real_number, "Positive number?", real_number > 0)

    # Approach 2: read the file as if it were the input from a user.
    print("\nApproach 2")
    file.seek(0)

    first_phrase = file.get_line()
    second_phrase = file.get_line()
    integer_numbers = []
    integer_numbers_sum = 0
    for number_text in file.get_line().split(" "):
        var number = int(number_text)
        integer_numbers.append(number)
        integer_numbers_sum += number

    real_number = float(file.get_line())

    print(first_phrase)
    print(second_phrase)
    printt(integer_numbers, integer_numbers_sum)
    printt(real_number, "Positive number?", real_number > 0)

    file.close()

The example in JavaScript requires the same HTML page that was previously used to submit a text file for processing. In browsers, only the first approach is possible, as the whole file will be read. In the first approach, the content is split into an array of lines. Next, each line is processed according to the data type(s) stored in it, to extract the values. As an example of using the converted values, the integer numbers were added, and the real number was compared with zero.

In Python, Lua and GDScript, both approaches are valid. The example in Lua is slightly longer because it uses previously defined subroutines to write and process values. The first approach works similarly to the JavaScript one; thus, it will not be commented again. For the second approach, in each of the languages:

  1. A call to seek() changes the position in the file. Subroutines such as seek() allow changing the read and/or write position in the file, to access or modify values at arbitrary positions. In text files, it is common to use an operation to return to the start of the file (set position) or advance to its end (end position). It is also possible to advance or go back relative to the current position (cur or current position), though such movements are not always possible in text files.

    The values used in each programming language for seek operations often vary. Thus, it is important to consult the documentation. For Python: documentation; for Lua: documentation; for GDScript: documentation.

    In the case of the example, the implementation returns to the beginning of the file (seek set), with zero offset. In practice, this allows avoiding closing and reopening the file to read it from the beginning again.

  2. Next, each line is processed according to its value. The lines are read individually. In Python, as readline() keeps the line break at the end, a call to rstrip() (documentation) can remove it.

    In Lua, one can read the values in the same way she/he would get user input from the standard input (stdin).
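
The seek positions described above can be tried in Python with the following minimal sketch. The temporary file and its two lines are assumptions created only for the experiment; they are not the file used in the examples. The sketch also illustrates one case in which a movement relative to the current position is not possible in a text file.

```python
import io
import os
import tempfile

# Create a small temporary text file to experiment with (hypothetical contents).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False,
                                 encoding="utf-8") as temporary:
    temporary.write("first line\nsecond line\n")
    file_path = temporary.name

with open(file_path, "r", encoding="utf-8") as file:
    first_read = file.readline()

    # Return to the beginning (set position, zero offset) and read again.
    file.seek(0, io.SEEK_SET)
    second_read = file.readline()

    # Jump to the end; there is nothing left to read from there.
    file.seek(0, io.SEEK_END)
    after_end = file.readline()

    # In text mode, Python rejects nonzero offsets relative to the current position.
    try:
        file.seek(-5, io.SEEK_CUR)
        relative_seek_worked = True
    except io.UnsupportedOperation:
        relative_seek_worked = False

os.remove(file_path)
print(first_read == second_read, repr(after_end), relative_seek_worked)
```

In files opened in binary mode, relative offsets are accepted, because every position corresponds to exactly one byte.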

Data extraction from text files allows more sophisticated content manipulation. However, it is not always possible or simple to correctly extract data from any text file. For instance, if the contents of the file used in the example were changed, it would be necessary to rewrite the data extraction code. Thus, a better solution needs to be generic enough to accommodate changing scenarios. To do this, a structure is required to define a pattern for the stored data.

Structured Text Files

The adoption of structured text file formats can make it significantly easier to extract data from text files. The next subsections describe some popular formats and markup languages to store and exchange data. The acronym of each format is typically used as the extension for text files stored in the format.

Although some formats can be easily manipulated as strings, ideally one should use (or create) a high quality library for professional use of any of the formats.

Furthermore, there are tools to convert and support the use of structured text. Two interesting examples are Data-Selector (Dasel) and a list of tools available in this repository.

Comma Separated Values (CSV)

A popular format used by mathematical programs and spreadsheets is called Comma Separated Values (CSV). The name of the format is its own specification: values are separated by commas.

Olá,meu,nome,é,Franco,Tudo bem?
1,2,3,4,5,6
1.1,2.2,3.3,4.4,5.5,6.6

In English, the file could be translated as:

Hello,my,name,is,Franco,How are you?
1,2,3,4,5,6
1.1,2.2,3.3,4.4,5.5,6.6

If a value must be stored with a comma, it can be defined between double quotes. However, the convention can vary according to the implementation.

For the official specification of the format, one can consult the Request for Comment (RFC) 4180.
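
As an illustration, the following minimal sketch reads the English version of the file with Python's csv module from the standard library. The last text value was changed to "Fine, and you?" (an assumption made for this sketch) to show a value with a comma stored between double quotes; io.StringIO stands in for a file opened with open().

```python
import csv
import io

# Same rows as the English example, with a comma added inside a quoted value.
csv_text = (
    'Hello,my,name,is,Franco,"Fine, and you?"\n'
    "1,2,3,4,5,6\n"
    "1.1,2.2,3.3,4.4,5.5,6.6\n"
)

rows = list(csv.reader(io.StringIO(csv_text)))

# Every field is read as a string; conversions are the program's job.
texts = rows[0]
integer_numbers = [int(value) for value in rows[1]]
real_numbers = [float(value) for value in rows[2]]

print(texts)
print(integer_numbers, sum(integer_numbers))
print(real_numbers)
```

A plain split(",") would break the quoted value into two fields, which is one reason to prefer a library over manual string manipulation.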

Tab Separated Values (TSV)

A variation of the CSV format consists of using tabulations (tabs) instead of commas to separate values. In strings of programming languages, tabulations are normally represented by \t. The resulting format is called Tab Separated Values (TSV).

Olá	meu	nome	é	Franco	Tudo bem?
1	2	3	4	5	6
1.1	2.2	3.3	4.4	5.5	6.6

When one uses TSV, it is important to configure her/his text editor to insert tabulations. Text editors for programming can be configured to replace a tabulation with a certain number of spaces. For the TSV format, it is important that tabulations are, indeed, real tabulation characters.

The specification of the format is simple; it is available at this page of the Internet Assigned Numbers Authority (IANA).
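
The importance of real tabulation characters can be checked with a short Python sketch. Both lines below are hypothetical samples: the first uses \t between values, while the second was "typed" with spaces, as an editor configured to replace tabs would produce.

```python
# A line with real tabulation characters between the values.
tsv_line = "1.1\t2.2\t3.3\t4.4\t5.5\t6.6"
# A line in which the editor replaced each tabulation with spaces.
spaced_line = "1.1    2.2    3.3"

values = [float(field) for field in tsv_line.split("\t")]
broken_fields = spaced_line.split("\t")

print(values)
# The whole line is returned as a single field, because it has no tabulation.
print(broken_fields)
```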

Extensible Markup Language (XML)

Extensible Markup Language (XML) is one of the pioneer formats for data exchange. The specification of the format is available at the official page of the format.

The XML format resembles HTML, though a programmer can choose the names of the tags. The only requirement is that the name of the starting tag must match the name of the one that closes it. A tag can be defined as <TagName attribute="value">Stored value</TagName>. If it does not have a value in-between, it can be written as <TagName attribute="value"></TagName>, or, simply, <TagName attribute="value"/>.

<?xml version="1.0" encoding="UTF-8"?>
<!-- This is a comment. -->
<Valores>
  <Texts>
    <Text>Olá</Text>
    <Text>meu</Text>
    <Text>nome</Text>
    <Text>é</Text>
    <Text>Franco</Text>
    <Text>Tudo bem?</Text>
  </Texts>
  <IntegerNumbers>
    <IntegerNumber>1</IntegerNumber>
    <IntegerNumber>2</IntegerNumber>
    <IntegerNumber>3</IntegerNumber>
    <IntegerNumber>4</IntegerNumber>
    <IntegerNumber>5</IntegerNumber>
    <IntegerNumber>6</IntegerNumber>
  </IntegerNumbers>
  <RealNumbers>
    <RealNumber value="1.1"/>
    <RealNumber value="2.2"/>
    <RealNumber value="3.3"/>
    <RealNumber value="4.4"/>
    <RealNumber value="5.5"/>
    <RealNumber value="6.6"/>
  </RealNumbers>
</Valores>

Values can be stored in-between tags or as attributes of a tag. For instance, a Text tag could be changed to <Text text="Franco"/>. In the same way, one could write <RealNumber>1.1</RealNumber>.
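
Both ways of storing a value can be read with Python's xml.etree.ElementTree module from the standard library. The following is a minimal sketch: the shortened document, the root tag name Values and the helper stored_value are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# A shortened, hypothetical document that mixes both storage styles.
xml_text = """<Values>
  <Texts>
    <Text>Franco</Text>
    <Text text="Hello"/>
  </Texts>
  <RealNumbers>
    <RealNumber value="1.1"/>
    <RealNumber>2.2</RealNumber>
  </RealNumbers>
</Values>"""

root = ET.fromstring(xml_text)


def stored_value(element, attribute_name):
    """Return the attribute when present; otherwise, the text between the tags."""
    return element.get(attribute_name, element.text)


texts = [stored_value(element, "text") for element in root.find("Texts")]
real_numbers = [float(stored_value(element, "value"))
                for element in root.find("RealNumbers")]

print(texts)
print(real_numbers)
```

As in CSV, every value is read as a string, so numbers must be converted explicitly.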

XML is versatile, although it can be verbose (that is, long to write). Newer formats are more concise, and are easier to write and to process than XML. On the other hand, XML provides additional features such as schemas and namespaces that are rare in other formats. Programmers can define schemas as a way to validate files following the proposed schema. They are useful to guarantee the validity of files with data for a given domain or problem.

JavaScript Object Notation (JSON)

Nowadays, JavaScript Object Notation (JSON) is one of the most popular formats for structured text files. The format can be easily used with the JavaScript language. As its name suggests, the format is similar to JavaScript Objects.

{
  "Texts": ["Olá", "meu", "nome", "é", "Franco", "Tudo bem?"],
  "Integer Numbers": [1, 2, 3, 4, 5, 6],
  "Real Numbers": [1.1, 2.2, 3.3, 4.4, 5.5, 6.6]
}

In other words, you already know how to use them. JSON can have arrays, dictionaries and primitive data types as values. Integer or real values are written without quotes. Strings require double quotes. Logic values (true and false) are written without quotes. It should be noted that JSON does not allow using comments.

The format specification is available at the official page. Modern programming languages often provide ready to use JSON implementations in the standard library. Otherwise, it is very likely that there exists a library to use the format in the chosen programming languages.
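
As an example of a ready-to-use implementation, Python provides the json module in the standard library. The following minimal sketch parses a document similar to the previous one; the texts are in English and the key "Valid" is a hypothetical addition to show a logic value.

```python
import json

json_text = """
{
  "Texts": ["Hello", "my", "name", "is", "Franco", "How are you?"],
  "Integer Numbers": [1, 2, 3, 4, 5, 6],
  "Real Numbers": [1.1, 2.2, 3.3, 4.4, 5.5, 6.6],
  "Valid": true
}
"""

# The parser converts each value to the corresponding Python type.
data = json.loads(json_text)
integer_sum = sum(data["Integer Numbers"])

# Writing the dictionary back to text and parsing it again keeps the values.
round_trip = json.loads(json.dumps(data))

print(data["Texts"])
print(integer_sum)
print(data["Valid"], round_trip == data)
```

Unlike the CSV and XML cases, numbers and logic values arrive already converted, because the format itself distinguishes them from strings.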

Furthermore, there are tools to write JSON files such as DEPOT. The interface of such tools resembles that of spreadsheet programs.

To operate JSON files in the command line, the jq program is very convenient for searches and data extraction.

YAML Ain't Markup Language (YAML)

JSON is a good format for programming, though it is quite close to the structures used in programming languages. There are formats that aim to be more accessible to end-users. One of them is called YAML Ain't Markup Language (YAML), whose specification is available at the official page. The contents of the official page are an example of a file in the format itself.

%YAML 1.2
---
Texts:
  - Olá
  - meu
  - nome
  - é
  - Franco
  - Tudo bem?
Integer Numbers:
  - 1
  - 2
  - 3
  - 4
  - 5
  - 6
Real Numbers:
  - 1.1
  - 2.2
  - 3.3
  - 4.4
  - 5.5
  - 6.6

The YAML format resembles a structured list created in a text editor.

Tom's Obvious Minimal Language (TOML)

Tom's Obvious Minimal Language (TOML) is another format of structured text file. The format resembles configuration files used on Windows. The specification is available at the official page of the project.

Texts = ["Olá", "meu", "nome", "é", "Franco", "Tudo bem?"]
IntegerNumbers = [1, 2, 3, 4, 5, 6]
RealNumbers = [1.1, 2.2, 3.3, 4.4, 5.5, 6.6]

In particular, Godot Engine uses a format that resembles TOML to store scenes created in the editor in a text format.

Binary Files

Besides text files, data can be stored as direct memory dumps. In other words, instead of encoding the content as text, the bytes stored in the primary memory are saved in secondary memory. Such files are called binary files (even considering that, technically, text files are also encoded binary files).

Unlike text files, binary files are not meant to be readable by human beings. There are no marks of where a datum starts or ends; the stored data is simply a sequence of bits stored in memory. Thus, to determine the exact contents of a binary file, one must know the order and the types of the data in the sequence in which they are stored. If this information is unknown, identifying the values in a binary file requires reverse engineering efforts to determine what is stored.

A binary file can be thought of as a big data array. However, instead of predefined positions, each datum takes a certain number of bits or bytes. To extract the value of a datum, one must read a certain number of bytes from an initial position until a final one. Thus, binary files allow reading and writing data using random access; if one knows the "address" of the value in a file, she/he can access and/or modify it. In other words, a displacement (offset) from a known position (such as the beginning, end or the current position) allows writing or extracting data in arbitrary positions of the file. To perform the displacement, it suffices to perform a seek operation. This makes it easier to extract data from binary files.
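
The idea of an "address" inside the file can be sketched in Python with the struct module. In this minimal sketch, an in-memory io.BytesIO stands in for a file opened in binary mode; the five integers and the helper names read_integer_at and write_integer_at are hypothetical.

```python
import io
import struct

INTEGER_SIZE = struct.calcsize("<i")  # 4 bytes for a little-endian 32-bit integer

# An in-memory binary "file" with five 32-bit integers.
binary_file = io.BytesIO(struct.pack("<5i", 10, 20, 30, 40, 50))


def read_integer_at(stream, index):
    """Read the integer stored at a given position, counted from zero."""
    stream.seek(index * INTEGER_SIZE, io.SEEK_SET)
    return struct.unpack("<i", stream.read(INTEGER_SIZE))[0]


def write_integer_at(stream, index, value):
    """Overwrite the integer stored at a given position."""
    stream.seek(index * INTEGER_SIZE, io.SEEK_SET)
    stream.write(struct.pack("<i", value))


third = read_integer_at(binary_file, 2)
write_integer_at(binary_file, 2, 300)

# Offsets can also be relative to the end: the last integer starts 4 bytes before it.
binary_file.seek(-INTEGER_SIZE, io.SEEK_END)
last = struct.unpack("<i", binary_file.read(INTEGER_SIZE))[0]

print(third, read_integer_at(binary_file, 2), last)
```

The offset of each value is the index multiplied by the size of the type, which works only because every value takes the same number of bytes.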

In fact, a combination of binary files with records (in the sense of Plain Old Data or POD) can make it trivial to store and recover data in some programming languages. Nevertheless, this is often easier to do in lower level programming languages (such as C and C++) than in higher level ones (such as JavaScript, Python and Lua). The reason is that, in low level programming languages, data types usually have fixed sizes in bytes, which makes it easier to obtain the size of a POD record.
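
In a higher level language, a fixed-size record must be described explicitly. The following minimal sketch uses struct.Struct in Python; the record layout (an integer identifier, a real score and a name limited to 8 bytes) and the helper names are hypothetical choices made for illustration.

```python
import struct

# Hypothetical record layout: a 32-bit integer, a 32-bit float and a fixed
# 8-byte name field, all little-endian and without padding.
RECORD = struct.Struct("<if8s")


def pack_record(identifier, score, name):
    """Convert the fields to bytes; shorter names are padded with zero bytes."""
    return RECORD.pack(identifier, score, name.encode("utf-8"))


def unpack_record(data):
    """Rebuild the fields from bytes, removing the padding from the name."""
    identifier, score, raw_name = RECORD.unpack(data)
    return identifier, score, raw_name.rstrip(b"\x00").decode("utf-8")


# Two records stored one after the other, as they would be in a file.
records = pack_record(1, 1.5, "Ana") + pack_record(2, -2.25, "Franco")

# As every record has the same size, the second one starts at RECORD.size.
second = unpack_record(records[RECORD.size:2 * RECORD.size])

print(RECORD.size, second)
```

Names longer than 8 bytes would be truncated by this layout, which is the price of keeping every record with the same size.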

Hexadecimal Editor (Hex Editor)

Text editors can read and modify text files. Hexadecimal editors can read and modify binary files.

Unlike text editors that (even if simple) are usually provided by default in any operating system, normally it is necessary to install a hexadecimal editor. Some graphical environments include hexadecimal editors among the default programs. For instance, KDE provides the editor Okteta. Okteta can be used and installed in Windows and Linux operating systems.

An open source alternative that is compatible with Windows, Linux and macOS is called ImHex. There are also online tools, such as HexEd.it (which can be used for no cost, though it is not open source).

Finally, some text editors include modes to act as hexadecimal editors. For instance, GNU Emacs provides hexl-mode to view and edit hexadecimal files.

In this topic, the ImHex editor will be used to inspect binary files. It was chosen because it allows defining values based on offsets ("memory addresses") and highlighting them with different colors. The syntax to define the data is available in the documentation of the program.

Basic Operations Using Binary Files

Operations using binary files are similar to those performed on text files, though the resulting file is harder to view than a text file. Hence the suggestion to use a hexadecimal editor.

Similarly to text files, every use of a binary file starts by opening (or creating) a file, and ends by closing it. The main differences are the read and write operations. Instead of characters or lines (any kind of encoded text), the operations work with memory blocks with sizes defined in bytes.

Writing a Binary File

The implementations in Python, Lua and GDScript are simpler than the JavaScript one. The version in JavaScript requires the programmer to build the bytes to be written in the file. Python, Lua and GDScript provide subroutines for automatic conversion, making the process easier. The version in GDScript is the simplest to read, because the language provides specific methods for each data type. Thus, it can be interesting to read the GDScript code first, then the code in Python or Lua, and, lastly, the JavaScript code. As the four implementations are equivalent, this can make it easier to understand the programs.

The Lua implementation requires version 5.3 or more recent to use string.pack(). For versions 5.1 and 5.2 of the language, one can use an extension by one of Lua's authors to obtain struct.pack and struct.unpack, which work like string.pack() and string.unpack(). It is possible to change the version of the Lua interpreter used by ZeroBrane Studio in Project, then Lua Interpreter, then Lua 5.3.

The files in the examples have the .bin extension, as it is a common choice for binary files. However, any extension can be chosen. An extension does not modify the contents of a file, which means that it does not define what a file is. An extension acts as a hint (heuristic) for the operating system; the extension makes it easier for the operating system to choose an appropriate program to open the file. To view the contents of the created file in a text editor, one can choose the extension .txt (which is interesting to do). One can also create her/his own extension, such as adopting an extension .data or .franco.

let file_path = "franco-garcia-written_binary_file.bin"
let contents = []

let text_line = "Olá, meu nome é Franco.\n"
let text_encoder = new TextEncoder()
let encoded_text = text_encoder.encode(text_line)
let enconded_text_size = encoded_text.length

for (let repetitions = 0; repetitions < 2; ++repetitions) {
    let size = new Int32Array(1)
    size[0] = enconded_text_size
    let bytes_size = new Uint8Array(size.buffer)
    for (let byte_size of bytes_size) {
        contents.push(byte_size)
    }

    for (let text_byte of encoded_text) {
        contents.push(text_byte)
    }
}

let integer_numbers = new Int32Array(3)
integer_numbers[0] = 1
integer_numbers[1] = 2
integer_numbers[2] = 3
let bytes_integer_numbers = new Uint8Array(integer_numbers.buffer)
for (let byte_numbers of bytes_integer_numbers) {
    contents.push(byte_numbers)
}

let real_number = new Float32Array(1)
real_number[0] = -1.23
let bytes_real_number = new Uint8Array(real_number.buffer)
for (let byte_number of bytes_real_number) {
    contents.push(byte_number)
}

let bytes = new Uint8Array(contents)
let data = new Blob([bytes], {type: "application/octet-stream"})
let file = new File([data], file_path, {type: data.type})
console.log("File created successfully.")

let download_link = document.createElement("a")
download_link.target = "_blank"
download_link.href = URL.createObjectURL(file)
download_link.download = file.name
if (confirm("Do you want to download the file '" + file.name + "'?")) {
    download_link.click()
    // In this case, revokeObjectURL() can be used both for confirmation
    // and for cancelling.
    URL.revokeObjectURL(download_link.href)
}
import io
import struct
import sys

try:
    file_path = "franco-garcia-written_binary_file.bin"
    file = open(file_path, "wb")

    text_line = "Olá, meu nome é Franco.\n"
    encoded_text = bytearray(text_line.encode("utf-8"))
    enconded_text_size = len(encoded_text)
    file.write(struct.pack("i", enconded_text_size))
    file.write(encoded_text)

    file.write(struct.pack("i", enconded_text_size))
    file.write(encoded_text)

    file.write(struct.pack("i", 1))
    file.write(struct.pack("i", 2))
    file.write(struct.pack("i", 3))

    file.write(struct.pack("f", -1.23))

    file.close()

    print("File created successfully.")
except IOError as exception:
    print("Error when trying to create the binary file.", file=sys.stderr)
    print(exception)
except OSError as exception:
    print("Error when trying to create the binary file.", file=sys.stderr)
    print(exception)
local EXIT_FAILURE = 1

local file_path = "franco-garcia-written_binary_file.bin"
local file = io.open(file_path, "wb")
if (file == nil) then
    print(debug.traceback())

    error("Error when trying to create the binary file.")
    os.exit(EXIT_FAILURE)
end

local text_line = "Olá, meu nome é Franco.\n"
file:write(string.pack("s", text_line))
file:write(string.pack("s", text_line))

file:write(string.pack("i", 1))
file:write(string.pack("i", 2))
file:write(string.pack("i", 3))

file:write(string.pack("f", -1.23))

io.close(file)

print("File created successfully.")
extends Node

const EXIT_FAILURE = 1

func _ready():
    var file_path = "franco-garcia-written_binary_file.bin"
    var file = File.new()
    if (file.open(file_path, File.WRITE) != OK):
        printerr("Error when trying to create the binary file.")
        get_tree().quit(EXIT_FAILURE)

    file.store_pascal_string("Olá, meu nome é Franco.\n")
    file.store_pascal_string("Olá, meu nome é Franco.\n")

    file.store_32(1)
    file.store_32(2)
    file.store_32(3)

    file.store_float(-1.23)

    file.close()

    print("File created successfully.")

In each program, the size of the string was written before the phrase. This provides the correct number of bytes to extract the data in programs that read the file. In compiled programming languages, such as C and C++, one can improve this technique and include the size as the first position of an array. The technique is called Pascal string, for it was popularized by implementations of the Pascal programming language.
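
The role of the size can be seen in a round trip: writing two strings and reading them back. The following is a minimal Python sketch with hypothetical helper names; it assumes a 4-byte little-endian size and UTF-8 text, and uses an in-memory io.BytesIO in place of a file opened in binary mode.

```python
import io
import struct


def write_prefixed_string(stream, text):
    """Write a 4-byte little-endian size followed by the UTF-8 bytes."""
    encoded = text.encode("utf-8")
    stream.write(struct.pack("<i", len(encoded)))
    stream.write(encoded)


def read_prefixed_string(stream):
    """Read the size first, then exactly that many bytes of text."""
    (size,) = struct.unpack("<i", stream.read(4))
    return stream.read(size).decode("utf-8")


buffer = io.BytesIO()
write_prefixed_string(buffer, "Olá, meu nome é Franco.\n")
write_prefixed_string(buffer, "Second phrase")

# Return to the beginning to read what was written.
buffer.seek(0)
first = read_prefixed_string(buffer)
second = read_prefixed_string(buffer)

print(repr(first), repr(second))
```

The size counts bytes, not characters: in UTF-8, accented letters such as "á" and "é" take more than one byte, so the value written can be larger than the length of the string.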

For the implementation of the example, opening and closing files occurs similarly to text files. In Python and Lua, a "b" is added to the mode when opening the file ("wb") to create a binary file in write mode. In JavaScript, the MIME type changes to "application/octet-stream" (though the format will be defined by the creation process). In GDScript, the file is created as normal.

The writing varies depending on the language. The simplest among the languages considered for examples is GDScript, which provides specific methods for each data type. The method store_pascal_string() (documentation) allows writing a string which is preceded by an integer value with its size; store_32() (documentation) can write integer numbers with 4 bytes (32 bits); store_float() (documentation) can write real numbers in floating point with 4 bytes (32 bits).

Python and Lua work similarly, with a subroutine to generate sequences of bytes (byte arrays) for each type. The principle is to create types that are compatible with C structs (records).

Python provides the module struct (documentation); the method pack() (documentation) can create binary sequences. Each parameter corresponds to a data type; "i" is for a 4 bytes integer number; "f" is for a 4 bytes floating point number. Strings need to be encoded (for instance, in UTF-8) with encode() (documentation), and a bytearray (documentation) must be created.

In Lua, the version 5.3 introduced string.pack() (documentation) for the same purpose. The parameter "s" writes the size and the string; "i" writes a default size integer number; "f" writes a default size floating point number. In the author's machine, the default size was 4 bytes for both cases.

JavaScript is the language that requires the most effort, because there is no subroutine for direct type conversion. In JavaScript, one must create a fixed size array with the desired values, then convert the array to a byte array (Uint8Array), then insert each value in an array to write to the file. Assuming 4 bytes (32 bits) values:

  • For signed integer numbers, one should first create an Int32Array (documentation), then store the desired values, then create a Uint8Array (documentation);
  • For real numbers, one should first create a Float32Array (documentation), then store the desired values, then create a Uint8Array;
  • For strings, one should first create a TextEncoder (documentation), then encode the desired text, which results in a Uint8Array. As in the other cases, it is also interesting to store the size of the string before the bytes of the text.

To make the operations easier in JavaScript, it can be worth creating functions to perform the conversions. This will be done later for some examples, such as for the creation of a sound file (functions prefixed with pack_) and to assemble values read from files (functions prefixed with unpack_). The functions were not created for this first example to demonstrate that it is possible to create a larger value and convert it at once.

Furthermore, JavaScript uses a Blob (documentation) to store the bytes before saving them. The term blob is used in programming to refer to binary data, without a type interpretation. In this case, they are all the bytes that will be saved.
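As a minimal sketch of the idea described for Python, the next block packs a phrase preceded by its size. The helper name pack_pascal_string() is hypothetical, and the "<" prefix (which forces little-endian order and a 4 bytes integer on any machine) is an assumption of this sketch; the plain "i" format follows the conventions of the machine that runs the program.

```python
import struct


def pack_pascal_string(text: str) -> bytes:
    """Encode text as UTF-8 and prefix it with its size in bytes."""
    encoded = text.encode("utf-8")
    # "<i": little-endian, 4 bytes signed integer (an assumption of this sketch).
    return struct.pack("<i", len(encoded)) + encoded


packed = pack_pascal_string("Olá, meu nome é Franco.\n")
print(len(packed))  # 30: 4 bytes for the size + 26 bytes for the text
print(packed[:4])   # b'\x1a\x00\x00\x00' (26 in little-endian order)
```

The size counts bytes, not characters: the accented letters use 2 bytes each in UTF-8, which is why the phrase has 24 characters but 26 bytes.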

The use of types with a fixed quantity of bytes can save memory when saving files. For instance, the values 0, 1, 0000001, -123, 123456, 2147483647 and -2147483648 each require 4 bytes of memory as 32 bits integer numbers. In text, each character requires at least one byte; thus, any value written with 5 or more characters (including the sign, decimal points, commas or leading zeros) takes more space as text than as a 4 bytes number. For instance, -2147483648 has 11 characters, which would require 11 bytes when encoded in ASCII or UTF-8. The same applies to floating point numbers.
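The comparison can be checked with a short Python sketch. The function storage_sizes() is a hypothetical helper created for this illustration; it returns the number of bytes used by a value as a 32 bits integer and as UTF-8 text.

```python
import struct


def storage_sizes(value: int) -> tuple:
    """Return (bytes as a 32 bits integer, bytes as UTF-8 text)."""
    binary_size = len(struct.pack("<i", value))
    text_size = len(str(value).encode("utf-8"))
    return binary_size, text_size


for value in [0, 1, -123, 123456, 2147483647, -2147483648]:
    binary_size, text_size = storage_sizes(value)
    print(value, binary_size, text_size)
```

Small values such as 0 are shorter as text (1 byte), while -2147483648 needs 11 bytes as text and 4 bytes as a binary integer.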

A second advantage is the greater ease for data extraction. As all sizes are known, one can determine where each of them starts and ends. To load the created file, it suffices to follow the same creation order, though reading the data instead of writing it. If the size of a string is saved, it can be read to learn the exact number of characters for each saved text.

Byte Order and Endianness

Unlike text files, the binary files generated by each previous example can be different. This can happen because they depend, for instance, on the number of bytes that has been chosen (or defined by the language) to represent each data type. Besides, binary files (and the memory of a computer itself) can adopt one of two orders for bytes: little-endian (LE) and big-endian (BE). In big-endian order, the most significant byte is stored at the lowest memory address. In little-endian order, the most significant byte is stored at the highest memory address. This means that, in little-endian order, the bytes are stored in memory in the inverse of the order in which the number is usually written. The little-endian order is common for processors of the x86 and AMD64 (x64) architectures. The big-endian order is common for network operations (network order).

Strictly speaking, the order of bytes may also affect text files. For instance, files encoded in Unicode can have an initial value called Byte Order Mark (BOM), which can be used to determine whether the file uses the little-endian or big-endian order. The values for each Unicode encoding (such as UTF-8) can be found in this Wikipedia article. BOM may be optional, mandatory or prohibited depending on the adopted format. For the UTF-8 encoding, it is optional (for instance, the examples from this topic do not use BOM). Regardless of the case, if they exist, the values of BOM should not be shown by the text editor (or the program processing the file); they are only useful to learn how to decode the rest of the bytes of the file.
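A short Python sketch can illustrate the two orders. In the struct module, the prefixes "<" and ">" request little-endian and big-endian order, respectively; the value 1 was chosen arbitrarily for the illustration.

```python
import struct
import sys

value = 1
little = struct.pack("<i", value)  # least significant byte first
big = struct.pack(">i", value)     # most significant byte first

print(sys.byteorder)     # order used by the machine running the program
print(little.hex(" "))   # 01 00 00 00
print(big.hex(" "))      # 00 00 00 01
```

Both sequences store the same number; only the position of each byte changes.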

To illustrate the difference due to byte order, the following table can be useful. It was extracted from the general questions relating to UTF or Encoding Form.

Bytes       | Encoding Form
00 00 FE FF | UTF-32, big-endian
FF FE 00 00 | UTF-32, little-endian
FE FF       | UTF-16, big-endian
FF FE       | UTF-16, little-endian
EF BB BF    | UTF-8

A byte has 8 bits. Two hexadecimal digits can represent 256 values, which range from 00 (0 as decimal) to FF (255 as decimal). In other words, two hexadecimal digits correspond to a byte. As a curiosity, a single hexadecimal digit corresponds to 4 bits (half a byte), called a nibble (or nybble or nyble). As a piece of useful information, it is easy to convert a hexadecimal value to binary. To do this, it suffices to write the value of each hexadecimal digit in binary format. For instance, F corresponds to 1111; E corresponds to 1110. Thus, FE corresponds to 11111110. Similarly, a sequence of nibbles can be converted to a hexadecimal value. For instance, 1011 in binary corresponds to the hexadecimal value B.
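The digit by digit conversion can be sketched in Python. The function hex_to_binary() is a hypothetical helper for this illustration; it converts each hexadecimal digit to its nibble of 4 bits.

```python
def hex_to_binary(hex_digits: str) -> str:
    """Convert each hexadecimal digit to a nibble (4 bits)."""
    return " ".join(format(int(digit, 16), "04b") for digit in hex_digits)


print(hex_to_binary("FE"))  # 1111 1110
print(format(0b1011, "X"))  # B
```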

With the previous information, the table can be analyzed. In particular, the entries for UTF-32 and UTF-16 allow observing the difference due to byte order with ease. The lines for UTF-32 represent the same 4 bytes integer number: 00 00 FE FF (hexadecimal value; as a decimal value: 65279). In the big-endian version, the value is stored in the order in which it is written. In the little-endian version, the bytes appear in inverted order (FF FE 00 00). Interpreting the inverted bytes as a big-endian unsigned integer value would result in the value 4294836224. In the case of BOM, reading the value would allow identifying the byte order of the file (for instance, by comparing the read value with each one of the possible results). The same applies to the entries for UTF-16.
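The numbers mentioned for the UTF-32 entries can be verified with int.from_bytes(), which receives the byte order as a parameter. The variable names are illustrative.

```python
bom_big_endian = bytes.fromhex("00 00 FE FF")
bom_little_endian = bytes.fromhex("FF FE 00 00")

# Each sequence read in its own order results in the same number.
print(int.from_bytes(bom_big_endian, "big"))        # 65279
print(int.from_bytes(bom_little_endian, "little"))  # 65279

# Reading the little-endian bytes as big-endian results in another number.
print(int.from_bytes(bom_little_endian, "big"))     # 4294836224
```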

In the case of UTF-8, three bytes are used as three numbers with 1 byte each (in decimal: 239, 187, 191). As UTF-8 encodes values in sequences of 1 byte, the byte order does not affect the result. After all, it is not possible to invert the byte order if there exists only one byte per value. For instance, FE (254 in decimal) is a single byte. FE inverted is still FE, for the inversion does not apply to individual bits, but to whole bytes. Thus, the utility of the BOM in UTF-8 files is to show that the file is encoded in UTF-8 (instead of another encoding, such as ISO-8859-1, which encodes the Latin alphabet). In other words, it serves as an additional guarantee to recognize the file.
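As a sketch of how a program could compare the first bytes of a file with the values of the table, the next block uses the BOM constants provided by the module codecs from Python. The function detect_bom() and the list KNOWN_BOMS are hypothetical names created for this illustration.

```python
import codecs

# UTF-32 must be tested before UTF-16, because FF FE 00 00 starts with FF FE.
KNOWN_BOMS = [
    (codecs.BOM_UTF32_BE, "UTF-32, big-endian"),
    (codecs.BOM_UTF32_LE, "UTF-32, little-endian"),
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF16_BE, "UTF-16, big-endian"),
    (codecs.BOM_UTF16_LE, "UTF-16, little-endian"),
]


def detect_bom(data: bytes):
    """Return the encoding form suggested by the BOM, or None if there is none."""
    for bom, name in KNOWN_BOMS:
        if data.startswith(bom):
            return name
    return None


print(detect_bom(bytes.fromhex("FF FE 00 00")))        # UTF-32, little-endian
print(detect_bom(codecs.BOM_UTF8 + "Olá".encode("utf-8")))  # UTF-8
print(detect_bom(b"No BOM here"))                      # None
```

The absence of a BOM does not identify the encoding; it only means that the program must learn the encoding by other means.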

In programs, the interpreter or compiler is responsible for storing values in the correct byte order in memory. However, when one creates a binary file, she/he must know the adopted order if she/he wants to share the created file with machines of different architectures. Otherwise, the values that were read can be interpreted incorrectly, resulting in potentially unexpected errors.

Inspecting the Created Binary File

Although it is possible to open a binary file (that is not a text file) in a text editor (actually, any file can be opened in a text editor), the result will be peculiar.

For instance, one can try opening the file franco-garcia-written_binary_file.bin in a text editor. Some characters will be correct, others will be strange. This happens because the text editor tries to interpret all the data as encoded characters, which does not reflect the real content of the file. For instance, the following image illustrates the binary file created by the Python implementation, open in the text editor GNU Emacs. The left side of the image shows the result interpreted as text. The right side of the image illustrates the file as shown in hexadecimal using the mode hexl-mode. In the hexadecimal mode, the numbers in lines and columns are offsets from the start of the file. The values displayed are hexadecimal numbers representing the stored bytes in little-endian order.

Binary file `franco-garcia-written_binary_file.bin` open as a text file (on the left side of the image) and in a hexadecimal inspection mode (on the right side of the image) in the text editor GNU Emacs. The left image shows many values that are not characters, written as escape values. The right image shows correct values for ASCII characters (which means the characters that do not have accents).

On the left side of the image, only values encoded using valid ASCII values are displayed correctly. Accented characters are shown as escape values (hence the choice of using the Portuguese phrase in this topic). The stored integer numbers and the real number are also shown as escape values. On the right side of the image, with the hexadecimal visualization, all values for the bytes are correct. Once again, values corresponding to ASCII text are displayed correctly in the part interpreted as such (the far right of the image). The other values require knowing the stored data type. For instance, in the last line, the sequence 0200 0000 in little-endian corresponds to the hexadecimal sequence 0000 0002 in big-endian, which is equivalent to the integer number 2.

Thus, to correctly interpret the stored data, it is necessary to know where each datum starts and ends, as well as the stored data type. To do this, one can use a hexadecimal editor. The next image inspects the contents of the file generated by the Python program using the hexadecimal editor ImHex. A hexadecimal editor displays the binary values stored in the file as hexadecimal numbers (hence the name). If there is a need to view them another way (for instance, as binary values), the editor can be configured to change the way it displays values.

Inspection of the binary file `franco-garcia-written_binary_file.bin` generated in Python using the hexadecimal editor ImHex.

In the image, each datum is highlighted to make it easy to identify the value. The matching between patterns uses the options Pattern Editor and Pattern Data in the ImHex editor. When the definitions are correct, the values for the data are interpreted correctly. The transcription of the data types is provided in the next block. The type s32 is a signed 32 bits (4 bytes) integer; char is a 1 byte character; float is a 32 bits (4 bytes) real number.

s32 phrase1_size @ 0x00;
char phrase1[26] @ 0x04;
s32 phrase2_size @ 0x1E;
char phrase2[26] @ 0x22;
s32 integer1 @ 0x3C;
s32 integer2 @ 0x40;
s32 integer3 @ 0x44;
float real @ 0x48;

The value after the at sign (@) corresponds to the offset of the datum from the start of the file. This means, for instance, that phrase2_size starts at the byte with address 1E in the hexadecimal base (base 16), which is equal to 30 in the decimal base (base 10). If a read starts at the offset 1E and reads 4 bytes, the extracted value will be the size of the second phrase.
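The use of offsets can be sketched in Python without a file on disk. The next block rebuilds the same layout in memory, assuming little-endian order and 4 bytes sizes (the layout described for the Python version), and uses io.BytesIO to simulate the file.

```python
import io
import struct

# Rebuild the layout in memory (assumption: little-endian, 4 bytes sizes).
phrase = "Olá, meu nome é Franco.\n".encode("utf-8")
data = b""
for _ in range(2):
    data += struct.pack("<i", len(phrase)) + phrase
data += struct.pack("<3i", 1, 2, 3)
data += struct.pack("<f", -1.23)

# Jump to the offset 0x1E and read the 4 bytes of phrase2_size.
stream = io.BytesIO(data)
stream.seek(0x1E)
phrase2_size, = struct.unpack("<i", stream.read(4))
print(phrase2_size)  # 26

# unpack_from() reads from an offset without moving a file position.
integer2, = struct.unpack_from("<i", data, 0x40)
print(integer2)      # 2
```

With a real file, the same seek() and read() calls would work on the object returned by open(file_path, "rb").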

The type definition allows displaying the expected values.

Name         | Type   | Value
phrase1_size | s32    | 26
phrase1      | String | Olá, meu nome é Franco.
phrase2_size | s32    | 26
phrase2      | String | Olá, meu nome é Franco.
integer1     | s32    | 1
integer2     | s32    | 2
integer3     | s32    | 3
real         | float  | -1.23

The size used for each array is the size of the string considering the accents (for the file is encoded in UTF-8).

For the JavaScript and GDScript implementations, the type definition proposed for Python should work (although it might require changing the byte order in JavaScript). For the Lua implementation, it may be necessary to adjust formats and addresses, because string.pack("s") uses an 8 bytes unsigned integer in 64-bit machines to store the size of the string. This corresponds to the type u64 in ImHex.

u64 phrase1_size @ 0x00;
char phrase1[26] @ 0x08;
u64 phrase2_size @ 0x22;
char phrase2[26] @ 0x2A;
s32 integer1 @ 0x44;
s32 integer2 @ 0x48;
s32 integer3 @ 0x4C;
float real @ 0x50;

In 32-bit machines, the definition for the other languages can work in Lua, as a 4 bytes integer would be used to store the size of the strings, as it happens on the other cases.
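The effect of the size of the prefix on the addresses can be sketched in Python. The function read_prefixed_string() is a hypothetical helper; the formats "<i" (4 bytes) and "<Q" (8 bytes, unsigned) are assumptions chosen to mimic the two layouts described above.

```python
import struct


def read_prefixed_string(data: bytes, offset: int, size_format: str):
    """Read a size-prefixed UTF-8 string; return the text and the next offset."""
    size, = struct.unpack_from(size_format, data, offset)
    start = offset + struct.calcsize(size_format)
    text = data[start:start + size].decode("utf-8")
    return text, start + size


phrase = "Olá, meu nome é Franco.\n".encode("utf-8")
layout_4_bytes = struct.pack("<i", len(phrase)) + phrase
layout_8_bytes = struct.pack("<Q", len(phrase)) + phrase

text, next_offset = read_prefixed_string(layout_4_bytes, 0, "<i")
print(hex(next_offset))  # 0x1e, where phrase2_size starts in the first definition

text, next_offset = read_prefixed_string(layout_8_bytes, 0, "<Q")
print(hex(next_offset))  # 0x22, where phrase2_size starts in the Lua definition
```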

Reading a Binary File

There are two main ways of reading a binary file:

  1. Read the whole file as a memory block, potentially abstracted as an array of bytes;
  2. Extract data from the binary file.

It is also possible to read the file byte per byte, though this is not always useful. One of the examples in this topic demonstrates how to read the file byte per byte to copy files.

Reading the Whole Binary File

Some programming languages allow reading an entire file at once and extracting all the data as fields in a record. For instance, in C and C++, a single function call can read and load all data saved in a file to a POD record at once (it is also possible to save all the data in a record in a single call). This happens because the memory can be manipulated as a block of bytes. If the data is stored at contiguous addresses, the memory can be saved and restored as if it was a single block (because it is, in a way). Assuming that the data layout is the same, one can save and restore multiple variables whose addresses start at a base address.
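Python does not expose memory as C and C++ do, though the struct module can approximate the idea for data with a fixed layout. The next sketch describes the three integers and the real number of the example as a single format ("<3if"), and converts a whole block of 16 bytes at once. The names Record and RECORD_LAYOUT are hypothetical; strings with variable sizes would still have to be read separately.

```python
import struct
from collections import namedtuple

# Hypothetical record with a fixed layout: three 32 bits integers and a float.
Record = namedtuple("Record", ["integer1", "integer2", "integer3", "real"])
RECORD_LAYOUT = struct.Struct("<3if")

# Save all fields as a single block of bytes...
block = RECORD_LAYOUT.pack(1, 2, 3, -1.23)
print(RECORD_LAYOUT.size)  # 16

# ...and restore all of them with a single call.
record = Record(*RECORD_LAYOUT.unpack(block))
print(record.integer1, record.integer2, record.integer3, record.real)
```

The real number is printed with a small error (close to -1.23), because 32 bits floating point numbers cannot represent the value exactly.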

In JavaScript, Python, Lua and GDScript, this is not immediate, as the languages abstract the use of the memory. Although JavaScript does read the entire file, the result is an array of bytes, not a memory block that can be interpreted according to a programmer's will.

Reading a Binary File and Extracting Data

Although it is not possible to load all data at once, it is still simple to extract each saved datum. The implementations in Python, Lua and GDScript are simpler than the JavaScript one. The JavaScript implementation requires converting the stored bytes into variables of intermediate types before loading them in memory. It also requires an HTML page to send the file in browsers; the code is presented later in the text.

The implementation of Lua assumes the version 5.3 or newer, as commented for the creation of binary files.

// This file must be saved in a file called "script.js".
// It will be processed by an HTML page with code to send the
// binary file by a form.

// <https://francogarcia.com/en/blog/development-environments-javascript/>
function add_element(value, element_name = "p") {
    const parent = document.getElementById("contents")
    const new_element = document.createElement(element_name)
    new_element.innerHTML = value
    parent.appendChild(new_element)
}

function unpack_int32(bytes) {
    var array_buffer = new ArrayBuffer(4)
    var result = new DataView(array_buffer)
    for (let index = 0; index < bytes.length; ++index) {
        result.setUint8(index, bytes[index])
    }

    // true for little-endian, false for big-endian.
    return result.getInt32(0, true)
}

function unpack_float32(bytes) {
    var array_buffer = new ArrayBuffer(4)
    var result = new DataView(array_buffer)
    for (let index = 0; index < bytes.length; ++index) {
        result.setUint8(index, bytes[index])
    }

    // true for little-endian, false for big-endian.
    return result.getFloat32(0, true)
}

function read_file(binary_file) {
    if (!binary_file) {
        return
    } else if (!(binary_file instanceof File)) {
        return
    }

    let file_reader = new FileReader()
    file_reader.onload = function(event) {
        let bytes = new Uint8Array(event.target.result)
        let index = 0

        let text_decoder = new TextDecoder()

        let size_first_phrase = unpack_int32(new Uint8Array(bytes.slice(index, index + 4)))
        index += 4
        let first_phrase = text_decoder.decode(bytes.slice(index, index + size_first_phrase))
        index += size_first_phrase

        let size_second_phrase = unpack_int32(new Uint8Array(bytes.slice(index, index + 4)))
        index += 4
        let second_phrase = text_decoder.decode(bytes.slice(index, index + size_second_phrase))
        index += size_second_phrase

        let integer_numbers = []
        let integer_numbers_sum = 0
        for (let i = 0; i < 3; ++i) {
            let number = unpack_int32(new Uint8Array(bytes.slice(index, index + 4)))
            index += 4
            integer_numbers.push(number)
            integer_numbers_sum += number
        }

        console.log(index, bytes)
        let real_number = unpack_float32(new Uint8Array(bytes.slice(index, index + 4)))
        index += 4

        console.log(first_phrase)
        console.log(second_phrase)
        console.log(integer_numbers, integer_numbers_sum)
        console.log(real_number, "Positive number?", real_number > 0)

        add_element(first_phrase)
        add_element(second_phrase)
        add_element("[" + integer_numbers + "] " + integer_numbers_sum)
        add_element(real_number + " " +  "Positive number?" + " " + (real_number > 0))
    }
    file_reader.readAsArrayBuffer(binary_file)

    // Disallows the submission of the form, allowing the visualization of the
    // result of add_element() in the same page.
    return false
}
import io
import struct
import sys

try:
    file_path = "franco-garcia-written_binary_file.bin"
    file = open(file_path, "rb")

    # The comma is important, because struct.unpack() returns a tuple.
    size_first_phrase, = struct.unpack("i", file.read(struct.calcsize("i")))
    encoded_text = file.read(size_first_phrase)
    first_phrase = encoded_text.decode("utf-8")

    size_second_phrase, = struct.unpack("i", file.read(struct.calcsize("i")))
    encoded_text = file.read(size_second_phrase)
    second_phrase = encoded_text.decode("utf-8")

    integer_numbers = []
    integer_numbers_sum = 0
    for i in range(3):
        number, = struct.unpack("i", file.read(struct.calcsize("i")))
        integer_numbers.append(number)
        integer_numbers_sum += number

    real_number, = struct.unpack("f", file.read(struct.calcsize("f")))

    file.close()

    # To remove the line break:
    # first_phrase.rstrip()
    print(first_phrase)
    print(second_phrase)
    print(integer_numbers, integer_numbers_sum)
    print(real_number, "Positive number?", real_number > 0)
except OSError as exception:
    # In Python 3, IOError is an alias of OSError; a single clause handles both.
    print("Error when trying to read the binary file.", file=sys.stderr)
    print(exception)
function write_indentation(level)
    for indentation = 1, level do
        io.write("  ")
    end
end

function write_table(a_table, level)
    level = level or 1
    if (type(a_table) == "table") then
        io.write("{\n")
        for key, value in pairs(a_table) do
            write_indentation(level)
            io.write(tostring(key) .. ": ")
            write_table(value, level + 1)
            io.write("\n")
        end
        write_indentation(level - 1)
        io.write("},")
    else
        local quotes = ""
        if (type(a_table) == "string") then
            quotes = "\""
        end
        io.write(quotes .. tostring(a_table) .. quotes .. ",")
    end
end

-- The program starts here.
local EXIT_FAILURE = 1
local file_path = "franco-garcia-written_binary_file.bin"
local file = io.open(file_path, "rb")
if (file == nil) then
    print(debug.traceback())

    error("Error when trying to read the binary file.")
    os.exit(EXIT_FAILURE)
end

local size_first_phrase = string.unpack("T", file:read(string.packsize("T")))
print(size_first_phrase)
local first_phrase = file:read(size_first_phrase)

local size_second_phrase = string.unpack("T", file:read(string.packsize("T")))
print(size_second_phrase)
local second_phrase = file:read(size_second_phrase)

local integer_numbers = {}
local integer_numbers_sum = 0
for index_number = 1, 3 do
    local number = string.unpack("i", file:read(string.packsize("i")))
    table.insert(integer_numbers, number)
    integer_numbers_sum = integer_numbers_sum + number
end

local real_number = string.unpack("f", file:read(string.packsize("f")))

io.close(file)

-- To remove the line break:
-- string.sub(first_phrase, 1, size_first_phrase - 1)
print(first_phrase)
print(second_phrase)
write_table(integer_numbers)
print(" " .. integer_numbers_sum)
print(real_number, "Positive number?", real_number > 0)
extends Node

const EXIT_FAILURE = 1

func _ready():
    var file_path = "franco-garcia-written_binary_file.bin"
    var file = File.new()
    if (file.open(file_path, File.READ) != OK):
        printerr("Error when trying to read the binary file.")
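        get_tree().quit(EXIT_FAILURE)
        # quit() only takes effect at the end of the current frame; return to
        # avoid reading from a file that was not opened.
        return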

    var first_phrase = file.get_pascal_string()
    var second_phrase = file.get_pascal_string()

    var integer_numbers = []
    var integer_numbers_sum = 0
    for i in range(3):
        var number = file.get_32()
        integer_numbers.append(number)
        integer_numbers_sum += number

    var real_number = file.get_float()

    file.close()

    # To remove the line break:
    # first_phrase.rstrip("\n")
    print(first_phrase)
    print(second_phrase)
    printt(integer_numbers, integer_numbers_sum)
    printt(real_number, "Positive number?", real_number > 0)

For the JavaScript code, it is necessary to create an HTML page to send the file.

<!DOCTYPE html>
<html lang="en">

  <head>
    <meta charset="utf-8">
    <title>Binary File Read</title>
    <meta name="author" content="Franco Eusébio Garcia">
    <meta name="viewport" content="width=device-width, initial-scale=1">
  </head>

  <body>
    <header>
      <h1>File Read</h1>
    </header>

    <main>
      <!-- Form to send the binary file. -->
      <form method="post"
            enctype="multipart/form-data"
            onsubmit="return read_file(files.files[0])">
        <label for="files">Choose a binary file:</label>
        <input id="files"
               name="files"
               type="file"
               accept="application/octet-stream"/>
        <input type="submit"/>
      </form>

      <div id="contents">
      </div>

      <!-- The name of the JavaScript file must match the one defined below. -->
      <script src="./script.js"></script>
    </main>
  </body>

</html>

Now the source code can be discussed. The programs are simple; however, the operations required to work with raw bytes in the JavaScript, Python and Lua languages make the solution look harder than it really is. It might be worth comparing each implementation with the GDScript one, because the programs perform exactly the same operations (usually in the same order).

A note: each program will print a line break at the end of every phrase. This is not an error; the line break was stored in the file and recovered during the read. As the original program has saved strings with a line break at the end, they were recovered at the last position of the read values. It is possible to remove them after reading the values, or, if they are not desired, save the phrases without an ending line break on strings.

As it happens with the creation of binary files, subroutines to open files often use a flag or value to represent a binary file (instead of a text file). The "rb" used in some implementations serves this purpose.

In Python, read() (documentation) can read bytes from a file. The parameter is the desired number of bytes to read; struct.calcsize() (documentation) provides sizes (in bytes) for the chosen primitive types. The method unpack() (documentation) converts the read data into the chosen data types provided in the first parameter. For text, decode() (documentation) converts a sequence of bytes into a string.

In Lua, io.read() or file:read() (documentation) can read bytes, if a number is passed as a parameter. As it happens with Python, the implementation uses the size of the desired type, provided by string.packsize() (documentation); to convert the bytes into a value of the desired type, string.unpack() (documentation) can be used.

In GDScript, it is simple to manipulate binary files, because there are methods to abstract the conversions. To read data, one should use the get_ method corresponding to the store_ method used to store the data in the file. Thus, for instance, the method get_pascal_string() (documentation) reads an integer number with the size, followed by the bytes, to load a string; get_32() (documentation) loads an (unsigned) integer number with 4 bytes (32 bits; in a later example, the implementation of uint32_to_int32() will show how to recover the original sign); get_float() (documentation) loads a 4 bytes floating point number.

The JavaScript version requires more work than the others. FileReader can read a file, as in text files. However, the chosen method to read data is readAsArrayBuffer() (documentation), which provides an array of bytes. The array is saved in a Uint8Array (documentation), which is an array of unsigned one byte integers (in other words, an array of bytes). The remainder of the solution converts sequences of bytes stored in the array to other data types. The proposed unpack_int32() function creates a 4 bytes integer (32 bits; hence the value 4 provided to ArrayBuffer) using an array with four bytes, created by slice(). The values are stored in an ArrayBuffer (documentation), which is manipulated by a DataView (documentation) to set up the integer number, storing each of the four bytes with setUint8() (documentation), and reading the resulting value as a 32 bits integer with getInt32() (documentation). The function unpack_float32() works similarly, though it interprets the value as a 32 bits floating point number (a real number) using getFloat32() (documentation). To decode the encoded bytes of a string, one can use decode() (documentation) from TextDecoder (documentation). It should be noted that index is incremented by the number of bytes processed after each operation, with the purpose of setting the position in the array at which the next expected value starts.

The JavaScript example reinforces the idea that programming languages are tools, and that it is better to choose the best tool for each problem. The analysis of the example suggests that, although possible, JavaScript is not a very convenient language to manipulate bytes. Other languages can be better options for such operations. For instance, Python, Lua and GDScript provide some features to manipulate data, with abstractions as subroutines. If one really wishes to use JavaScript for byte manipulation, it is convenient to define similar subroutines. On the other hand, these languages can still require more effort to manipulate bytes than lower level languages such as C and C++, in which it would suffice to inform the compiler how the sequence of bytes should be interpreted.
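To illustrate why the sign must be recovered when a signed value is read as an unsigned one, the next block is a Python sketch of the arithmetic. It is not the GDScript implementation of uint32_to_int32() mentioned above (which appears in a later example); it only borrows the name, and assumes the usual two's complement representation.

```python
def uint32_to_int32(value: int) -> int:
    """Reinterpret an unsigned 32 bits value as a signed one (two's complement)."""
    if value >= 2**31:
        # The most significant bit is set: the original value was negative.
        return value - 2**32
    return value


print(uint32_to_int32(3))           # 3
print(uint32_to_int32(4294967295))  # -1
print(uint32_to_int32(2147483648))  # -2147483648
```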

Techniques Using Files

There are advanced techniques and resources for files, such as memory-mapped files, paging, and non-blocking input and output (IO) operations (asynchronous IO, as opposed to synchronous IO). There are convenient features for security, such as file locking. There are also practical techniques to store and retrieve data, especially for lower level programming languages.

As this is an introductory topic, the next sections highlight some simpler techniques, which can be used in all programs. Some may be closer to tips or advice than proper techniques, though they are still useful.

Check If a File Exists Before Creating a New One

Opening a file using write mode is a potentially destructive operation, because it truncates (erases) existing data if there already exists a file with the chosen path. There are two traditional ways to avoid the problem:

  1. The first is using a file system library that provides a function to check whether a file with the same name exists in the provided path. The support for this option, however, depends on the standard library or external libraries used as dependencies. For instance, Python provides several modules for this purpose. GDScript provides the Directory (documentation) class.

  2. The second works in any programming language that supports files. Before creating a new file, one can try opening it for reading. If the operation fails, there does not exist a file at the chosen path (or it cannot be read). Thus, she/he can create a new file without risking data loss. Otherwise, there already exists a file. In this case, the program can request a confirmation for the action, preferably with a warning about data loss. Another option is opening the file in read and write mode, to preserve existing data and insert new content, or in append mode, to add new content at the end of the file.

    There also exists a technique derived from the second way, if one desires always to use the same way of manipulating files. If a file does not exist, she/he creates one. After creation, the new file is closed and reopened in read and write mode. If the file does already exist, it is also opened in read and write mode. This approach will be required for GDScript, because it does not provide an append mode.
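As a complement specific to Python, the built-in open() also provides the mode "x" (exclusive creation), which fails with FileExistsError if the file already exists. The next block is a minimal sketch; the function create_if_missing() and the file name are hypothetical, and a temporary directory is used to avoid touching real files.

```python
import os
import tempfile


def create_if_missing(file_path: str, content: str) -> bool:
    """Create the file only if it does not exist; return True if it was created."""
    try:
        # "x" fails instead of truncating an existing file.
        with open(file_path, "x", encoding="utf-8") as file:
            file.write(content)
        return True
    except FileExistsError:
        return False


with tempfile.TemporaryDirectory() as directory:
    path = os.path.join(directory, "example.txt")
    print(create_if_missing(path, "Olá, meu nome é Franco!"))  # True
    print(create_if_missing(path, "This would overwrite."))    # False
```

Unlike checking and creating in two separate steps, the check and the creation happen in a single operation, so another program cannot create the file between them.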

The next example uses the second way to:

  • If the file does not exist, create one with a message: Olá, meu nome é Franco!;
  • If the file already exists, open it using append mode, adding a new exclamation mark at the end of the file.

The append mode for writing files does not destroy the original file; in this example, it is used for illustrative purposes, for it has not yet been used.

As the example does not apply to JavaScript (for browsers), it will only be presented for the other languages.

import io
import sys

file_path = "franco-garcia-file_reuse.txt"

try:
    file = open(file_path, "r")
    file.close()

    file = open(file_path, "a")
    file.write("!")
    file.close()

    print("File updated successfully.")
except FileNotFoundError:
    # The file does not exist yet: create it.
    try:
        file = open(file_path, "w")
        file.write("Olá, meu nome é Franco!")
        file.close()

        print("File created successfully.")
    except OSError as exception:
        print("Error when trying to create the text file.", file=sys.stderr)
        print(exception)
except OSError as exception:
    # Any other error (for instance, lack of permission to read or write).
    print("Error when trying to update the text file.", file=sys.stderr)
    print(exception)
local file_path = "franco-garcia-file_reuse.txt"
local file = io.open(file_path, "r")
if (file == nil) then
    file = io.open(file_path, "w")
    file:write("Olá, meu nome é Franco!")
    io.close(file)

    print("File created successfully.")
else
    io.close(file)
    file = io.open(file_path, "a")
    file:write("!")
    io.close(file)

    print("File updated successfully.")
end
extends Node

func _ready():
    var file_path = "franco-garcia-file_reuse.txt"
    var file = File.new()
    if (file.open(file_path, File.READ) != OK):
        file.open(file_path, File.WRITE)
        file.store_string("Olá, meu nome é Franco!")
        file.close()

        print("File created successfully.")
    else:
        file.close()

        file.open(file_path, File.READ_WRITE)
        file.seek_end()
        file.store_string("!")
        file.close()

        print("File updated successfully.")

The versions in Python and Lua use the "a" mode to append to the text file. In Python, one can use write() instead of writelines() to write a single line. It is also possible to use writelines(["Message"]), if one prefers.

As GDScript does not provide the mode, the alternative is opening the file for reading and updating it, using File.READ_WRITE (File.WRITE_READ should not be used for this purpose, as it removes the existing data similarly to File.WRITE). Then, seek_end() (documentation) allows advancing to the end of the file.

In the first use of the program (or when running it after erasing the file), the program will create and write Olá, meu nome é Franco! in the file. In the next uses (assuming that the file exists), it will add an exclamation mark to the end of the file. For instance, after running the program twice after the file creation, the file will store Olá, meu nome é Franco!!!.
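The alternative used in GDScript (read and write mode, followed by a jump to the end of the file) also works in languages that do provide an append mode. The next block is a Python sketch of it; the function append_with_seek() is hypothetical, and binary mode ("r+b") was chosen so that positions are plain byte offsets.

```python
import os
import tempfile


def append_with_seek(file_path: str, text: str) -> None:
    """Add text to the end of an existing file without using append mode."""
    # "r+b" opens for reading and writing, preserving the existing content.
    with open(file_path, "r+b") as file:
        file.seek(0, os.SEEK_END)  # Similar to seek_end() in GDScript.
        file.write(text.encode("utf-8"))


with tempfile.TemporaryDirectory() as directory:
    path = os.path.join(directory, "example.txt")
    with open(path, "wb") as file:
        file.write("Olá, meu nome é Franco!".encode("utf-8"))

    append_with_seek(path, "!")
    append_with_seek(path, "!")

    with open(path, "rb") as file:
        print(file.read().decode("utf-8"))  # Olá, meu nome é Franco!!!
```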

Editing an Existing File: Inserting New Content

There are two main ways of modifying an existing file: the simple way and the more complicated one. The simple way consists of reading the whole file into primary memory, modifying the content in primary memory, and rewriting the original file using the modified copy.
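As a minimal sketch of the simple way in Python, the following code reads a file, changes the copy in memory and rewrites the file. The function insert_text_simple() and the file name are hypothetical, chosen only for this illustration; the sketch assumes that the whole file fits in primary memory.

```python
import os
import tempfile

def insert_text_simple(file_path, position, new_text):
    # Read the whole file into primary memory.
    with open(file_path, "r", encoding="utf-8") as file:
        contents = file.read()

    # Modify the copy in memory (position is counted in characters).
    contents = contents[:position] + new_text + contents[position:]

    # Rewrite the original file using the modified copy.
    with open(file_path, "w", encoding="utf-8") as file:
        file.write(contents)

# Hypothetical file, created in a temporary directory for the demonstration.
with tempfile.TemporaryDirectory() as directory:
    file_path = os.path.join(directory, "simple_way.txt")
    with open(file_path, "w", encoding="utf-8") as file:
        file.write("Hello, Franco!\n")

    insert_text_simple(file_path, 7, "my name is ")

    with open(file_path, "r", encoding="utf-8") as file:
        result = file.read()

print(result, end="")
```

As the file is rewritten from scratch, the new text can have any length.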

The more complicated way consists of performing an in-place change. To preserve the existing content in the file, one must:

  1. Advance until the point of the new insertion;
  2. Save the content from this point until the end of the file (in a variable or another file);
  3. Add the new content;
  4. Restore the saved content after the new content.

This is necessary because writing new data in any position that is not the end of the file can overwrite existing values. To understand what happens, you can open a text editor and write any phrase. Next, press the Insert (Ins) key on the keyboard and type in any position before the end of the file. Instead of adding new characters, the existing text will be replaced by what is typed. To restore the original behavior, you should press Insert again.
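The four steps listed previously can be sketched in Python as follows. This is an illustration under assumptions, not a definitive implementation: insert_bytes_in_place() is a hypothetical name, the file is opened in binary mode so that positions are byte offsets, and the saved content is kept in a variable (which assumes that it fits in primary memory).

```python
import os
import tempfile

def insert_bytes_in_place(file_path, position, new_bytes):
    with open(file_path, "r+b") as file:
        # 1. Advance to the point of the new insertion.
        file.seek(position)
        # 2. Save the content from this point until the end of the file.
        tail = file.read()
        # 3. Return to the insertion point and add the new content.
        file.seek(position)
        file.write(new_bytes)
        # 4. Restore the saved content after the new content.
        file.write(tail)

# Hypothetical file, created in a temporary directory for the demonstration.
with tempfile.TemporaryDirectory() as directory:
    file_path = os.path.join(directory, "in_place.txt")
    with open(file_path, "wb") as file:
        file.write(b"Hello, Franco!\n")

    insert_bytes_in_place(file_path, 7, b"dear ")

    with open(file_path, "rb") as file:
        result = file.read()

print(result)
```

If step 2 were skipped, the write in step 3 would overwrite the first bytes of Franco instead of shifting them.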

For programming with files, there are libraries that make the operation easier. For instance, in Python, there is the fileinput (documentation) module to perform in-place modifications of text files. As the next example is specific to Python, the implementation uses the automatic closure.

import fileinput

file_path = "franco-garcia-in_place_modification.txt"
try:
    with open(file_path, "w", encoding="utf-8") as file:
        file.writelines([
            "Olá, meu nome é Franco!\n",
            "Tudo bem?\n",
            # Tchau means Bye.
            "Tchau!\n"
        ])

    with fileinput.FileInput(file_path, inplace = True) as file:
        for line in file:
            # print() is changed to modify the original file
            # line stores the original line.
            print("[Franco] ", line, sep="", end="")

    print("File created and modified successfully.")
except OSError as exception:
    # IOError is an alias of OSError since Python 3.3; one handler suffices.
    print("Error manipulating file.")
    print(exception)

On the other hand, the modification can be simpler provided that the change replaces exactly the same number of bytes in the original content. Thus, to make modifications easier, one can define file formats with structures of fixed and predefined sizes. With binary files, this is simpler: it suffices to define the expected size for each part of the file (preferably using a record); the technique will be discussed in a later section, using an array as an example. With text files, the technique is also possible, though more restricted. To define content with a fixed length, one can add spaces or other characters considered as neutral (or empty) to fill gaps.

For instance, to store a line with a name, there could be a maximum limit of 15 characters. If a name requires less space, the text is filled with the character chosen as neutral to complete 15 characters (it should be noted that characters with accents may take multiple bytes). A sequence of 15 spaces or ............... could be two possibilities for an empty name. The first has 15 spaces; the second has 15 dots. Franco (followed by 9 spaces) or Franco......... are two examples of a name that fits the space. A change to Franco Garcia (followed by 2 spaces) or Franco Garcia.. would be possible in the available space. The change can also remove content. A change to Garcia (followed by 9 spaces) or Garcia......... would also be valid and fit the available space.

The same applies to numbers. In this case, there would be a maximum limit for the combination of digits, decimal point, sign, etc. For instance, 12345, 1.234, -1.23 and 0 padded with four spaces (before or after the digit) are numbers encoded in text with exactly 5 characters.
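The padding described in the previous paragraphs can be sketched in Python with the string methods ljust(), rjust() and rstrip(). The names pad_field() and unpad_field() are hypothetical, and the dot is assumed to be the neutral character; as a consequence, a value that legitimately ends with dots would lose them when unpadded.

```python
FIELD_SIZE = 15  # Maximum limit of characters for a name.
FILLER = "."     # Character chosen as neutral (an assumption of this sketch).

def pad_field(text, size=FIELD_SIZE, filler=FILLER):
    if len(text) > size:
        raise ValueError("The text does not fit the field.")

    # Complete the text with the neutral character up to the fixed size.
    return text.ljust(size, filler)

def unpad_field(field, filler=FILLER):
    # Remove the neutral characters added to the end of the text.
    return field.rstrip(filler)

empty_name = pad_field("")
name = pad_field("Franco")
full_name = pad_field("Franco Garcia")
print(empty_name)
print(name)
print(full_name)
print(unpad_field(full_name))

# Numbers with exactly 5 characters, filled with spaces on the left.
negative_number = str(-1.23).rjust(5)
zero = str(0).rjust(5)
print(repr(negative_number), repr(zero))

# Characters and bytes are different units: in UTF-8, "é" takes 2 bytes.
print(len("é"), len("é".encode("utf-8")))
```

The last line shows why a limit in characters does not guarantee a fixed size in bytes when the text has accents.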

In short, when possible, it is easier to use the simple way and rewrite the entire contents of the file. Although it can seem rudimentary, it is used by many programs. A later section provides an example of how to use it to generate back-ups.

Binary File: Random Access and Modifications

Binary files can be manipulated as if they stored values in an array or record. To do this, one can use seek operations for random access.

As JavaScript for browsers reads the entire file at once, this section does not provide an example for the language: the example depends on changing the position in the file with a seek() subroutine. For a more compact example, assertions (assert()) are used instead of error handling. In a real program, suitable error handling is essential. To make the program easier to read, the implementation has been divided into the procedures create_file(), modify_file() and print_file(), called in this order.

import io
import struct
import sys

INTEGER_SIZE = struct.calcsize("i")

# index 0 corresponds to the first value.
def access_value(file, index):
    # The increment by INTEGER_SIZE allows to skip the value
    # stored with the total number count, to read the initial position
    # of the values of the array.
    header_offset = INTEGER_SIZE
    numbers_offset = header_offset + index * INTEGER_SIZE
    file.seek(numbers_offset)

def read_value(file, index):
    access_value(file, index)
    result, = struct.unpack("i", file.read(INTEGER_SIZE))

    return result

def write_value(file, index, value):
    access_value(file, index)
    file.write(struct.pack("i", value))

def create_file(file_path):
    print("Creation of " + file_path)

    file = open(file_path, "wb")
    assert(file)

    total_numbers = 20
    file.write(struct.pack("i", total_numbers))

    for number in range(total_numbers):
        file.write(struct.pack("i", number + 1))

    file.close()

def modify_file(file_path):
    print("Modification of " + file_path)

    file = open(file_path, "rb+")
    assert(file)

    total_numbers, = struct.unpack("i", file.read(INTEGER_SIZE))

    # Access the 11th stored integer value.
    index = 10
    value = read_value(file, index)
    print(value, value == 11)
    # Modify the 11th stored integer value.
    write_value(file, index, 12345)
    value = read_value(file, index)
    print(value, value == 12345)

    write_value(file, 0, -1111)
    write_value(file, 19, 191919)

    file.close()

def print_file(file_path):
    print("Values in " + file_path)

    file = open(file_path, "rb")
    assert(file)

    total_numbers, = struct.unpack("i", file.read(INTEGER_SIZE))
    for index_number in range(total_numbers):
        number, = struct.unpack("i", file.read(INTEGER_SIZE))
        print(number)

    file.close()

def main():
    file_path = "franco-garcia-numbers.bin"
    create_file(file_path)
    print()
    modify_file(file_path)
    print()
    print_file(file_path)

if (__name__ == "__main__"):
    main()
local INTEGER_SIZE = string.packsize("i")

-- index 0 corresponds to the first value.
function access_value(file, index)
    -- The increment by INTEGER_SIZE allows to skip the value
    -- stored with the total number count, to read the initial position
    -- of the values of the array.
    local header_offset = INTEGER_SIZE
    local numbers_offset = header_offset + index * INTEGER_SIZE
    file:seek("set", numbers_offset)
end

function read_value(file, index)
    access_value(file, index)
    local result = string.unpack("i", file:read(INTEGER_SIZE))

    return result
end

function write_value(file, index, value)
    access_value(file, index)
    file:write(string.pack("i", value))
end

function create_file(file_path)
    print("Creation of " .. file_path)

    local file = io.open(file_path, "wb")
    assert(file)

    local total_numbers = 20
    file:write(string.pack("i", total_numbers))

    for number = 1, total_numbers do
        file:write(string.pack("i", number))
    end

    file:close()
end

function modify_file(file_path)
    print("Modification of " .. file_path)

    local file = io.open(file_path, "r+b")
    assert(file)

    local total_numbers = string.unpack("i", file:read(INTEGER_SIZE))

    -- Access the 11th stored integer value.
    local index = 10
    local value = read_value(file, index)
    print(value, value == 11)
    -- Modify the 11th stored integer value.
    write_value(file, index, 12345)
    value = read_value(file, index)
    print(value, value == 12345)

    write_value(file, 0, -1111)
    write_value(file, 19, 191919)

    file:close()
end

function print_file(file_path)
    print("Values in " .. file_path)

    local file = io.open(file_path, "rb")
    assert(file)

    local total_numbers = string.unpack("i", file:read(INTEGER_SIZE))
    for index_number = 1, total_numbers do
        local number = string.unpack("i", file:read(INTEGER_SIZE))
        print(number)
    end

    file:close()
end

function main()
    local file_path = "franco-garcia-numbers.bin"
    create_file(file_path)
    print()
    modify_file(file_path)
    print()
    print_file(file_path)
end

main()
extends Node

# 4 bytes = 32 bits
const INTEGER_SIZE = 4

const INT31_MAX = 1 << 31
const INT32_MAX = 1 << 32

func uint32_to_int32(value):
    var result = (value + INT31_MAX) % INT32_MAX - INT31_MAX

    return result

# index 0 corresponds to the first value.
func access_value(file, index):
    # The increment by INTEGER_SIZE allows to skip the value
    # stored with the total number count, to read the initial position
    # of the values of the array.
    var header_offset = INTEGER_SIZE
    var numbers_offset = header_offset + index * INTEGER_SIZE
    file.seek(numbers_offset)

func read_value(file, index):
    access_value(file, index)
    var result = file.get_32()

    return result

func write_value(file, index, value):
    access_value(file, index)
    file.store_32(value)

func create_file(file_path):
    print("Creation of " + file_path)

    var file = File.new()
    file.open(file_path, File.WRITE)
    assert(file.is_open())

    var total_numbers = 20
    file.store_32(total_numbers)

    for number in range(total_numbers):
        file.store_32(number + 1)

    file.close()

func modify_file(file_path):
    print("Modification of " + file_path)

    var file = File.new()
    file.open(file_path, File.READ_WRITE)
    assert(file.is_open())

    var total_numbers = file.get_32()

    # Access the 11th stored integer value.
    var index = 10
    var value = read_value(file, index)
    printt(value, value == 11)
    # Modify the 11th stored integer value.
    write_value(file, index, 12345)
    value = read_value(file, index)
    printt(value, value == 12345)

    write_value(file, 0, -1111)
    write_value(file, 19, 191919)

    file.close()

func print_file(file_path):
    print("Values in " + file_path)

    var file = File.new()
    file.open(file_path, File.READ)
    assert(file.is_open())

    var total_numbers = file.get_32()
    for index_number in range(total_numbers):
        var number = uint32_to_int32(file.get_32())
        print(number)

    file.close()

func _ready():
    var file_path = "franco-garcia-numbers.bin"
    create_file(file_path)
    print()
    modify_file(file_path)
    print()
    print_file(file_path)

The created file starts with the count of integer values that it stores (as a 4-byte, or 32-bit, integer), followed by that many 4-byte integer numbers. The value of total_numbers allows knowing how many numbers the file stores. Alternatively, if the count were omitted and the file stored only the numbers, the quantity could be determined by dividing the total size of the file by the size of each stored value (4 bytes or 32 bits).
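The alternative without a stored count can be sketched in Python as follows. The file name is hypothetical, and os.path.getsize() is used here as one possible way to obtain the total size of the file in bytes.

```python
import os
import struct
import tempfile

INTEGER_SIZE = struct.calcsize("i")

# Hypothetical file, created in a temporary directory for the demonstration.
with tempfile.TemporaryDirectory() as directory:
    file_path = os.path.join(directory, "numbers_without_count.bin")

    # The file stores only the numbers, without the initial count.
    with open(file_path, "wb") as file:
        for number in range(1, 21):
            file.write(struct.pack("i", number))

    # Total size in bytes divided by the size of each stored value.
    file_size = os.path.getsize(file_path)
    total_numbers = file_size // INTEGER_SIZE

print(file_size, total_numbers)
```

The division is exact only if every value in the file has the same size, which is the premise of the format.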

Regardless, the addition of the size (as length) is interesting for conceptual reasons. For convenience, the 4 bytes value was stored in a constant named INTEGER_SIZE.

The most important function in the implementation is called access_value(). It uses seek() to change the file position. The implementation skips the first 4 bytes (the region of the file that stores the size) and adds an offset calculated by multiplying the index by the size of each value (4 bytes). The operation is similar to calculating a memory offset in an array, using a low level feature called pointer. With the function, one can access the stored values as if they were values in an array, though stored in secondary memory (in the file) instead of primary memory, and with the subroutines read_value() and write_value() instead of the square brackets operator. As the operation deals with offsets from a base address, the "indexation" starts at zero.

The remainder of the implementation consists of creating, modifying and reading the created file. In modify_file(), it is important to notice that the file was opened in read and write mode for binary files. In Lua, "r+b" is used to specify this mode. In Python, one can use "rb+", "r+b" or any other combination of the three symbols. In GDScript, one must use File.READ_WRITE; once again, it is important to use File.READ_WRITE instead of File.WRITE_READ, for the goal is to keep the original content of an existing file, not to erase it. For the same reason, modes starting with w (such as "w+b") are not suitable, as they erase the content when the file is opened.

To read and write text files, the mode is similar, though b is omitted in Lua and Python. In other words, "r+" is used.
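The difference between a mode that keeps the existing content and a mode that erases it can be checked with a short Python sketch. The file name and the stored bytes are hypothetical; the modes "r+b" and "w+b" are the standard ones accepted by open().

```python
import os
import tempfile

# Hypothetical file, created in a temporary directory for the demonstration.
with tempfile.TemporaryDirectory() as directory:
    file_path = os.path.join(directory, "modes.bin")
    with open(file_path, "wb") as file:
        file.write(b"ABCDEF")

    # "r+b": read and write, keeping the existing content.
    with open(file_path, "r+b") as file:
        file.seek(2)
        file.write(b"xy")
    with open(file_path, "rb") as file:
        after_update = file.read()

    # "w+b": read and write, but the content is erased when the file is opened.
    with open(file_path, "w+b") as file:
        after_truncation = file.read()

print(after_update)
print(after_truncation)
```

Only the two bytes at positions 2 and 3 change with "r+b"; with "w+b", nothing is left to read.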

For a binary file, edits can be performed in-place provided that the size in bytes of each modified value is the same. This is what allows using random access to a specific position, both to read from and to write to the file. Provided that the type and size are the same, any valid value for a type can be modified without compromising the structure of the file.

For comparison, text files use sequential access. That means it is necessary to traverse the file until finding the desired value, because each datum can have a different number of stored bytes. In particular, any modification that changes the size of the modified string will affect the sequence of bytes in the file, inhibiting the use of random access. On the other hand, if one reads the previous sentence with attention, she/he can conclude that, provided every value is saved as a string with the same size (as mentioned in a previous section), it is possible to simulate random access in a text file.
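A minimal Python sketch of simulated random access in a text file follows. It relies on several assumptions that are not part of the original examples: every line stores one name with exactly 15 characters plus a line break, the dot is the neutral character, and the text is restricted to ASCII, so that each character takes one byte. The file is opened in binary mode to work with byte offsets. All names (encode_name(), read_name(), write_name() and the file name) are hypothetical.

```python
import os
import tempfile

NAME_SIZE = 15               # Characters reserved for each name.
RECORD_SIZE = NAME_SIZE + 1  # The name plus the "\n" that ends the line.
FILLER = "."

def encode_name(name):
    # ASCII is assumed, so each character takes exactly one byte.
    field = name.ljust(NAME_SIZE, FILLER).encode("ascii")
    if len(field) != NAME_SIZE:
        raise ValueError("The name does not fit the field.")

    return field

def read_name(file, index):
    # Every line has the same size, so the offset can be calculated.
    file.seek(index * RECORD_SIZE)
    return file.read(NAME_SIZE).decode("ascii").rstrip(FILLER)

def write_name(file, index, name):
    file.seek(index * RECORD_SIZE)
    file.write(encode_name(name))

# Hypothetical file, created in a temporary directory for the demonstration.
with tempfile.TemporaryDirectory() as directory:
    file_path = os.path.join(directory, "names.txt")
    with open(file_path, "wb") as file:
        for name in ["Franco", "Ana", "Bruno"]:
            file.write(encode_name(name) + b"\n")

    with open(file_path, "r+b") as file:
        second = read_name(file, 1)
        write_name(file, 1, "Franco Garcia")
        updated = read_name(file, 1)
        third = read_name(file, 2)
        file.seek(0)
        all_lines = file.read().decode("ascii")

print(second, updated, third)
print(all_lines, end="")
```

The second line is changed without touching the first or the third, because the new value takes the same 15 bytes as the old one.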

Returning to the example, the GDScript version requires attention when using negative numbers in binary files. As the values are saved and recovered without sign, the number must be reinterpreted after the read if it can be negative. The function uint32_to_int32() has this purpose; it was implemented as an adaptation of the note provided for store_16() (documentation). If the unsigned value belongs to the interval interpreted as negative numbers in two's complement, the sequence of operations restores the negative sign of the number. If the number is positive, the value is preserved. The << operator is a bitwise operation that shifts bits to the left. In particular, for non-negative integer numbers (in other words, natural numbers), each left shift by one bit is equivalent to multiplying the number by two; however, the operation is faster than calculating a power. Thus, the resulting value corresponds to $2^{31}$ for INT31_MAX and $2^{32}$ for INT32_MAX.
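The same reinterpretation can be reproduced in Python to check the arithmetic. The sketch assumes that the struct module is used to simulate what happens in GDScript: the value is packed as a signed 32-bit integer ("<i") and read back as an unsigned one ("<I").

```python
import struct

INT31_MAX = 1 << 31  # 2 ** 31 = 2147483648
INT32_MAX = 1 << 32  # 2 ** 32 = 4294967296

def uint32_to_int32(value):
    # Values from 2 ** 31 onward wrap around to the negative range.
    return (value + INT31_MAX) % INT32_MAX - INT31_MAX

# -1111 stored in 4 bytes and read back without sign.
unsigned_value, = struct.unpack("<I", struct.pack("<i", -1111))
print(unsigned_value)
print(uint32_to_int32(unsigned_value))

# A positive value is preserved.
print(uint32_to_int32(191919))
```

For -1111, the unsigned reading is 4294967296 - 1111 = 4294966185, and the function maps it back to -1111.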

Temporary Copy When Saving (or as a Back-Up of Previous Versions of the File)

One of the easiest ways to edit an existing file is replacing it by a new one with the updated content. A common practice is creating the new file as a temporary file, keeping the original as a back-up. To do this:

  1. Create a new file with a different name than the original one;
  2. If the file is created successfully, rename the original file as a back-up. Alternatively, the original file can be deleted (if one does not want a back-up);
  3. Rename the new file with the original name.

Operations to rename and delete files are common in Application Programming Interfaces (APIs) for file manipulation.

As this kind of operation is not common in browsers (nor safe; it is not a good idea allowing a website to erase files in the computer), the JavaScript version will, once again, be omitted.

import io
import os
import sys

file_path = "franco-garcia-file_back_up.txt"
file = open(file_path, "w")
assert(file)
file.write("Olá, meu nome é Franco!\n")
file.close()

file = open(file_path, "r")
assert(file)
contents = file.read()
file.close()

contents += "Tchau!\n"

file = open(file_path + ".TMP", "w")
assert(file)
file.write(contents)
file.close()

os.rename(file_path, file_path + ".BAK")
# Alternatively, to delete the original file:
# os.remove(file_path)

os.rename(file_path + ".TMP", file_path)

print("File created and modified successfully.")
local file_path = "franco-garcia-file_back_up.txt"
local file = io.open(file_path, "w")
assert(file)
file:write("Olá, meu nome é Franco!\n")
io.close(file)

file = io.open(file_path, "r")
assert(file)
local contents = file:read("*all")
io.close(file)

contents = contents .. "Tchau!\n"

file = io.open(file_path .. ".TMP", "w")
assert(file)
file:write(contents)
io.close(file)

os.rename(file_path, file_path .. ".BAK")
-- Alternatively, to delete the original file:
-- os.remove(file_path)

os.rename(file_path .. ".TMP", file_path)

print("File created and modified successfully.")
extends Node

func _ready():
    var file_path = "franco-garcia-file_back_up.txt"
    var file = File.new()
    file.open(file_path, File.WRITE)
    assert(file.is_open())
    file.store_string("Olá, meu nome é Franco!\n")
    file.close()

    file.open(file_path, File.READ)
    assert(file.is_open())
    var contents = file.get_as_text()
    file.close()

    contents += "Tchau!\n"

    file.open(file_path + ".TMP", File.WRITE)
    assert(file.is_open())
    file.store_string(contents)
    file.close()

    var directory = Directory.new()
    directory.rename(file_path, file_path + ".BAK")
    # Alternatively, to delete the original file:
    # directory.remove(file_path)

    directory.rename(file_path + ".TMP", file_path)

    print("File created and modified successfully.")

To make the example simpler, the implementation uses assertions (assert()) instead of handling errors.

The implementation is simple. The file is created with some content (that will be modified) and closed. The created file is opened again; the content is read; the file is closed. A new file is created to store the modified content; to make it different from the original one, the .TMP extension, which is commonly used for temporary files, was appended to the end of the original file name. The extension could use lowercase letters; the example simply opts for uppercase ones.

After the new file has been created, one can rename or delete the original file. In the example, the default implementation renames the original file by adding a .BAK suffix, which is commonly used for back-ups. The commented line can delete the original file. If it is used, the original file must not be renamed first (otherwise, the path passed to the removal must be updated). It is important to notice that the old file is deleted, not the new one. If the original file is deleted only after generating the new one, the old file is still available if the creation of the new file fails. Furthermore, files deleted with programming APIs are usually permanently deleted, which means they are not sent to the operating system's trash. Therefore, one should be careful when deleting files in programs.

In Python, the os module abstracts operating system operations. The subroutine os.rename() (documentation) allows renaming files; os.remove() (documentation) allows deleting files. In Lua, os.rename() (documentation) allows renaming files; os.remove() (documentation) allows deleting files. In GDScript, the class Directory provides the method rename() (documentation) to rename files, and remove() (documentation) to delete files.

Once again, one should be careful when deleting files. Otherwise, data loss may happen. Furthermore, although it is (potentially) possible to erase any file that the program has permission to read or write, it is not an ethical behavior to delete files that have not been created by the own program (except if the created program has the purpose of deleting files; still, it is polite asking for authorization before any removals or renaming).

Besides, some APIs can provide specific subroutines to create temporary files. For instance, Python provides the module tempfile (documentation) to create temporary files and directories. Lua provides io.tmpfile() (documentation) to create a file that is automatically deleted once the program ends. The language also provides os.tmpname() (documentation) to generate names for temporary files.
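As a hedged sketch of how the tempfile module could be combined with the renaming technique in Python, the following code writes the new content to a temporary file and, only after a successful write, replaces the original. The function save_with_temporary_file() and the file name are hypothetical; tempfile.mkstemp(), os.fdopen() and os.replace() are part of the standard library. Unlike the previous examples, this sketch does not keep a back-up.

```python
import os
import tempfile

def save_with_temporary_file(file_path, contents):
    directory = os.path.dirname(os.path.abspath(file_path))
    # The temporary file is created in the same directory as the target,
    # so that the final rename does not cross file systems.
    descriptor, temporary_path = tempfile.mkstemp(dir=directory, suffix=".TMP")
    try:
        with os.fdopen(descriptor, "w", encoding="utf-8") as file:
            file.write(contents)

        # os.replace() overwrites the destination if it already exists.
        os.replace(temporary_path, file_path)
    except OSError:
        # If anything fails, remove the temporary file; the original is kept.
        if os.path.exists(temporary_path):
            os.remove(temporary_path)

        raise

# Hypothetical file, created in a temporary directory for the demonstration.
with tempfile.TemporaryDirectory() as directory:
    file_path = os.path.join(directory, "file_back_up.txt")
    save_with_temporary_file(file_path, "Olá, meu nome é Franco!\n")
    save_with_temporary_file(file_path, "Olá, meu nome é Franco!\nTchau!\n")

    with open(file_path, "r", encoding="utf-8") as file:
        contents = file.read()

    remaining_files = os.listdir(directory)

print(contents, end="")
print(remaining_files)
```

The name of the temporary file is generated by mkstemp(), which avoids clashes with an existing .TMP file.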

File Size

The seek operation allows changing the position in the file. Both in text files and binary files, it is possible to skip to the end of the file or return to its beginning. The complementary operation of a seek is normally called tell, which provides the current position of the cursor in the file. If one finds the initial and the end position of a file, she/he can subtract the initial position from the end position to discover the file size in bytes.

import io

file_path = "franco-garcia-size_file.txt"
file = open(file_path, "w")
assert(file)
file.write("Olá, meu nome é Franco!\n")
file.close()

file = open(file_path, "r")
assert(file)

start_position = file.tell()
file.seek(0, io.SEEK_END) # or 2
end_position = file.tell()

file.close()

size = end_position - start_position
print("File size: ", size, " bytes.")
local file_path = "franco-garcia-size_file.txt"
local file = io.open(file_path, "w")
assert(file)
file:write("Olá, meu nome é Franco!\n")
io.close(file)

file = io.open(file_path, "r")
assert(file)

local start_position = file:seek()
file:seek("end")
local end_position = file:seek()

io.close(file)

local size = end_position - start_position
print("File size: " .. size .. " bytes.")
extends Node

func _ready():
    var file_path = "franco-garcia-size_file.txt"
    var file = File.new()
    file.open(file_path, File.WRITE)
    assert(file.is_open())
    file.store_string("Olá, meu nome é Franco!\n")
    file.close()

    file.open(file_path, File.READ)
    assert(file.is_open())

    var start_position = file.get_position()
    file.seek_end()
    var end_position = file.get_position()

    file.close()

    var size = end_position - start_position
    printt("File size: ", size, " bytes.")

In Python, tell() (documentation) provides the value of the current position in the file. In Lua, seek() without parameters can provide the value (documentation). In GDScript, this is provided by get_position() (documentation).

However, some implementations of file subroutines can return an incorrect size. Thus, if possible, it can be preferable to use a file system API to request the real size of a file.
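In Python, one such file system API is os.path.getsize() (os.stat() provides the same value in st_size). The following sketch assumes UTF-8 encoding and forces "\n" as the line ending, so that the result does not depend on the operating system; the file name is hypothetical.

```python
import os
import tempfile

text = "Olá, meu nome é Franco!\n"

# Hypothetical file, created in a temporary directory for the demonstration.
with tempfile.TemporaryDirectory() as directory:
    file_path = os.path.join(directory, "size_file.txt")
    with open(file_path, "w", encoding="utf-8", newline="\n") as file:
        file.write(text)

    size = os.path.getsize(file_path)
    size_from_stat = os.stat(file_path).st_size

print("Characters:", len(text))
print("Bytes:", size)
```

The text has 24 characters, but the file has 26 bytes, because á and é take 2 bytes each in UTF-8.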

Serialization (or Marshalling) and Deserialization (or Unmarshalling)

When one works with files or network transmission, terms such as serialization or marshalling refer to storing data in a way that allows recovering them later. The restored data should be identical to the original ones, as well as the types used to store them. A common application is serializing data stored as objects of classes or variables of record types.

A popular approach to serialization using files consists of using JSON. To illustrate the practice, the next section provides an example.

Examples

In problems involving files, the file input and output operations are often the simplest part of the solution. Data from the files is loaded into primary memory; data from primary memory is saved to the file. The remainder of the solution comprises processing strings, type conversions, initializing data in records...

As the examples provided over this topic have illustrated how to use the important subroutines to manipulate files, the examples in this section will be more practical and different, though potentially more complex.

With files and the concepts that have been explored previously, it is possible to start exploring more interesting topics and problems. For instance, the possibility of generating files to use in other programs allows creating multimedia content. Thus, one can start investigating simple formats for images and sounds. The simple formats can be reproduced using specific programs (such as media players) or converted to more common formats. This can be done programmatically or using conversion tools.

Therefore, the examples demonstrate possible applications for the programming knowledge that has been acquired hitherto. In a certain way, they are the closure of a basic journey of learning programming, marking a transition from a beginner to someone who has an adequate programming knowledge. Adequate in the sense that the bases and fundamentals are suitable to program many complex systems, though there is still a lack of experience and the required practice to reach competence and proficiency. In other words, there is potential; therefore, it is time to materialize potential into systems.

For didactic purposes and to make the code easier to read and understand, the implementations are not optimized. The lack of optimizations can be noticed in some examples, especially if the values of some parameters are increased. For instance, some implementations use text files and strings, when ideally they should use binary files and numeric types. Yet, even opting for simplicity, the examples can be complex for beginners (or even for professionals; it is not rare to find professionals who have not created an image or sound programmatically, especially without using libraries). On the other hand, at the end of the examples, you will understand better how a computer works.

Furthermore, some sections list and describe programs to use with the created files. The simplicity of implementing a format can result in greater complexity to use it. Multimedia files without headers require manual configuration to be used. In general, one can install only the recommended software, use an online alternative, or simply follow the material.

Alternatively, the author has provided online applications to display/play uncommon formats in the Tools page. If you choose to use them, you can ignore the subsections describing specialized tools, though they may still be useful to read.

Serialization and Deserialization with JSON Files

The program for recipes defined in Records can serve as a good application to illustrate how to perform serialization using JSON. To simplify the code blocks used as example, only the definition of each record, and the code for serialization and deserialization are provided. The remainder of the original code can be restored to define a complete program. One can even add new options to the original menu to save and load files.

The Lua version requires using an external library for JSON manipulation. For simplicity, the library json.lua has been chosen, as it provides subroutines to manipulate JSON in a single file. Details on how to use it will be provided before the documentation of subroutines that were called from the library.

Simple Example: Ingredient Record

To start with a simple example, JSON can be used as a format for content, without files. The Ingredient record can be a good choice to understand how JSON works.

class Ingredient {
    constructor(name = "", quantity = 0.0, measure_unit = "") {
        this.name = name
        this.quantity = quantity
        this.measure_unit = measure_unit
    }
}

let original = new Ingredient("Water", 3.0, "Cups")
// Serialization.
let json_text = JSON.stringify(original)
console.log(json_text)

// Deserialization.
let json_object = JSON.parse(json_text)
console.log(json_object)
let recovered = new Ingredient
recovered.name = json_object["name"]
recovered.quantity = json_object.quantity
recovered.measure_unit = json_object.measure_unit
console.log(recovered)

// If Ingredient was a JavaScript Object used as a record
// (for instance, without subroutines in variables):
original = {
    "name": "Water",
    "quantity": 3.0,
    "measure_unit": "Cups"
}
json_text = JSON.stringify(original)
console.log(json_text)

recovered = JSON.parse(json_text)
console.log(recovered)
import json

class Ingredient:
    def __init__(self, name = "", quantity = 0.0, measure_unit = ""):
        self.name = name
        self.quantity = quantity
        self.measure_unit = measure_unit

original = Ingredient("Water", 3.0, "Cups")
# Serialization.
json_text = json.dumps({
    "name": original.name,
    "quantity": original.quantity,
    "measure_unit": original.measure_unit
})
print(json_text)

# Deserialization.
json_object = json.loads(json_text)
print(json_object)
recovered = Ingredient()
recovered.name = json_object["name"]
recovered.quantity = json_object["quantity"]
recovered.measure_unit = json_object["measure_unit"]
print(recovered.name, recovered.quantity, recovered.measure_unit)
-- <https://github.com/rxi/json.lua>
local json = require "json"

function new_ingredient(name, quantity, measure_unit)
    local result = {
        name = name or "",
        quantity = quantity or 0.0,
        measure_unit = measure_unit or ""
    }

    return result
end

local original = new_ingredient("Water", 3.0, "Cups")
-- Serialization.
local json_text = json.encode(original)
print(json_text)

-- Deserialization.
local json_object = json.decode(json_text)
print(json_object.name, json_object.quantity, json_object.measure_unit)
local recovered = new_ingredient()
recovered.name = json_object["name"]
recovered.quantity = json_object["quantity"]
recovered.measure_unit = json_object["measure_unit"]
print(recovered.name, recovered.quantity, recovered.measure_unit)
extends Node

class Ingredient:
    var name
    var quantity
    var measure_unit

    func _init(name = "", quantity = 0.0, measure_unit = ""):
        self.name = name
        self.quantity = quantity
        self.measure_unit = measure_unit

func _ready():
    var original = Ingredient.new("Water", 3.0, "Cups")
    # Serialization.
    var json_text = to_json({
        "name": original.name,
        "quantity": original.quantity,
        "measure_unit": original.measure_unit
    })
    print(json_text)

    # Deserialization.
    var json_object = parse_json(json_text)
    print(json_object)
    var recovered = Ingredient.new()
    recovered.name = json_object["name"]
    recovered.quantity = json_object["quantity"]
    recovered.measure_unit = json_object["measure_unit"]
    printt(recovered.name, recovered.quantity, recovered.measure_unit)

One can create JSON content without a file (although it is possibly not very useful). JSON defines the format of the content to be stored in a text file; it is not a different file type. The JSON content is stored in a text file. Thus, to create a JSON file with the generated content, it suffices to save json_text in a text file (for instance, my_ingredient.json). The resulting file would have the following contents:

{"name":"Water","quantity":3.0,"measure_unit":"Cups"}

The 3.0 can be stored as 3. Furthermore, the order of the key/value pairs can be different; like in a dictionary or hash table, it is not relevant. Some implementations sort the keys alphabetically, others follow the order of the insertion of values, others use random order. To keep values in order in a JSON file in a portable way, it is necessary to use an array.
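These observations can be checked in Python with the json module. In the sketch, the two JSON strings are hypothetical variations of the same ingredient: they differ in the order of the keys and in how the quantity is written.

```python
import json

first = json.loads('{"name":"Water","quantity":3.0,"measure_unit":"Cups"}')
second = json.loads('{"measure_unit":"Cups","quantity":3,"name":"Water"}')

# Dictionaries are equal regardless of the order of the keys;
# furthermore, 3 == 3.0 in Python.
print(first == second)
# The resulting types differ, though: float for 3.0, int for 3.
print(type(first["quantity"]), type(second["quantity"]))

# sort_keys=True writes the keys in alphabetical order.
sorted_text = json.dumps(second, sort_keys=True)
print(sorted_text)

# An array keeps the order of its values.
ordered = json.loads('["name", "quantity", "measure_unit"]')
print(ordered)
```

If the type matters after loading (for instance, when the quantity must always be a real number), an explicit conversion such as float() can be applied to the recovered value.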

As in programming languages such as JavaScript and Lua, line breaks and spacing can be omitted from a JSON file. This reduces the size of the file to share and makes processing the contents faster, though it makes the file harder for human beings to read. To make it easier to read for people, one can format the file with suitable indentation. There are tools for this purpose, including online ones (for instance, one can search for "json pretty print", "json beautifier" or "json formatter"). Some text editors for programming also have features to format files.

{
  "name": "Water",
  "quantity": 3.0,
  "measure_unit": "Cups"
}
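As an illustration, Python's json.dumps() can generate both forms. This is a minimal sketch; the separators and indent parameters belong to the standard json module, and the variable names are assumptions made for the example.

```python
import json

ingredient = {"name": "Water", "quantity": 3.0, "measure_unit": "Cups"}

# Compact form: no spaces after the separators.
compact = json.dumps(ingredient, separators=(",", ":"))
# Readable form: one pair per line, indented by two spaces.
pretty = json.dumps(ingredient, indent=2)

print(compact)
print(pretty)
# Both strings describe the same data.
print(json.loads(compact) == json.loads(pretty))
```

The compact string is what a program would normally store or send; the indented one is meant for people.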

As the name suggests, the contents of a JSON file form a valid JavaScript object. Thus, one can copy and paste any of the previous blocks into the console of a browser to view the contents in a structured way. It is also possible to open JSON files in the browser for a structured view.

Working with JSON is similar to working with dictionaries and arrays in programming languages. In fact, many JSON APIs abstract the use of the format with dictionaries or arrays of dictionaries.

In this first example, one can work with the created JSON content as if it were a dictionary. The dictionary corresponds to the Ingredient record encoded as a dictionary.

It is simple to create JSON content in JavaScript. The subroutines have even been used previously to copy arrays and dictionaries. To convert JavaScript data to a JSON string, one can use JSON.stringify() (documentation). To convert a JSON string to a JavaScript object, one can use JSON.parse() (documentation).

In the other languages, the process is similar. Python and GDScript provide subroutines to process JSON in the standard library. Lua requires an external library, used as a dependency. One can also write her/his own code to process JSON in Lua.

  • Python:

  • Lua: the solution uses the library json.lua, available at this repository.

    To use it, you must copy or store the file json.lua in the same directory as the code in the example. The library is loaded with local json = require "json".

  • GDScript:

After the basics of the format, it is time to consider using JSON with files to store and recover data. To do this, one can define one or multiple programs. For instance, one program can create a file and another can read the created file. Alternatively, she/he could create a single program to create and read the file. The examples use two programs, to split the tasks.

Saving a JSON File

The next code snippets illustrate how to save the generated json_text in a text file.

class Ingredient {
    constructor(name = "", quantity = 0.0, measure_unit = "") {
        this.name = name
        this.quantity = quantity
        this.measure_unit = measure_unit
    }
}

let file_path = "franco-garcia-ingredient.json"

let original = new Ingredient("Water", 3.0, "Cups")
// Serialization.
let json_text = JSON.stringify(original)

let file = new File([json_text], file_path, {type: "application/json"})
console.log("File created successfully.")

let download_link = document.createElement("a")
download_link.target = "_blank"
download_link.href = URL.createObjectURL(file)
download_link.download = file.name
if (confirm("Do you want to download the file '" + file.name + "'?")) {
    download_link.click()
    // In this case, revokeObjectURL() can be used both for confirmation
    // and for cancelling.
    URL.revokeObjectURL(download_link.href)
}
import io
import json
import sys

class Ingredient:
    def __init__(self, name = "", quantity = 0.0, measure_unit = ""):
        self.name = name
        self.quantity = quantity
        self.measure_unit = measure_unit

file_path = "franco-garcia-ingredient.json"

original = Ingredient("Water", 3.0, "Cups")
# Serialization.
json_text = json.dumps({
    "name": original.name,
    "quantity": original.quantity,
    "measure_unit": original.measure_unit
})

try:
    file = open(file_path, "w")
    file.write(json_text)
    file.close()

    print("File created successfully.")
except IOError as exception:
    print("Error when trying to create the text file.", file=sys.stderr)
    print(exception)
except OSError as exception:
    print("Error when trying to create the text file.", file=sys.stderr)
    print(exception)
-- <https://github.com/rxi/json.lua>
local json = require "json"

local EXIT_FAILURE = 1

function new_ingredient(name, quantity, measure_unit)
    local result = {
        name = name or "",
        quantity = quantity or 0.0,
        measure_unit = measure_unit or ""
    }

    return result
end

local file_path = "franco-garcia-ingredient.json"

local original = new_ingredient("Water", 3.0, "Cups")
-- Serialization.
local json_text = json.encode(original)

local file = io.open(file_path, "w")
if (file == nil) then
    print(debug.traceback())

    error("Error when trying to create the text file.")
    os.exit(EXIT_FAILURE)
end

file:write(json_text)
io.close(file)

print("File created successfully.")
extends Node

const EXIT_FAILURE = 1

class Ingredient:
    var name
    var quantity
    var measure_unit

    func _init(name = "", quantity = 0.0, measure_unit = ""):
        self.name = name
        self.quantity = quantity
        self.measure_unit = measure_unit

func _ready():
    var file_path = "franco-garcia-ingredient.json"

    var original = Ingredient.new("Water", 3.0, "Cups")
    # Serialization.
    var json_text = to_json({
        "name": original.name,
        "quantity": original.quantity,
        "measure_unit": original.measure_unit
    })

    var file = File.new()
    if (file.open(file_path, File.WRITE) != OK):
        printerr("Error when trying to create the text file.")
        get_tree().quit(EXIT_FAILURE)
        # quit() only requests the exit; return to avoid using the unopened file.
        return

    file.store_line(json_text)
    file.close()

    print("File created successfully.")

The code to serialize the data is identical to the original example. The only difference is that the data is saved in a file. The file franco-garcia-ingredient.json can be opened in a text editor or browser.
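In Python, the open/write/close sequence can also be written with a with block and json.dump(), which writes directly to the file. The following is a minimal sketch, not a replacement for the program above; it uses a temporary directory and the illustrative file name ingredient.json to avoid touching real files.

```python
import json
import os
import tempfile

ingredient = {"name": "Water", "quantity": 3.0, "measure_unit": "Cups"}

# A temporary directory keeps the sketch from touching real files.
with tempfile.TemporaryDirectory() as directory:
    file_path = os.path.join(directory, "ingredient.json")

    # The file is closed automatically at the end of the with block,
    # even if the write raises an exception.
    with open(file_path, "w", encoding="utf-8") as file:
        json.dump(ingredient, file)

    # Read the file back as plain text to inspect what was stored.
    with open(file_path, "r", encoding="utf-8") as file:
        saved_text = file.read()

print(saved_text)
```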

Loading a JSON File

The next step consists of recovering the data from the file to perform the deserialization.

// This file must be saved in a file called "script.js".
// It will be processed by an HTML page with code to send the
// text file by a form.

class Ingredient {
    constructor(name = "", quantity = 0.0, measure_unit = "") {
        this.name = name
        this.quantity = quantity
        this.measure_unit = measure_unit
    }
}

function read_file(file_json) {
    if (!file_json) {
        // Returning false prevents the form from being submitted.
        return false
    }

    if (!(file_json instanceof File)) {
        return false
    }

    let file_reader = new FileReader()
    file_reader.onload = function(event) {
        let json_text = event.target.result

        // Deserialization.
        let json_object = JSON.parse(json_text)
        console.log(json_object)
        let recovered = new Ingredient()
        recovered.name = json_object["name"]
        recovered.quantity = json_object.quantity
        recovered.measure_unit = json_object.measure_unit
        console.log(recovered)
    }
    file_reader.readAsText(file_json)

    return false
}
import io
import json
import sys

class Ingredient:
    def __init__(self, name = "", quantity = 0.0, measure_unit = ""):
        self.name = name
        self.quantity = quantity
        self.measure_unit = measure_unit

file_path = "franco-garcia-ingredient.json"

try:
    file = open(file_path, "r")
    json_text = file.read()
    file.close()

    # Deserialization.
    json_object = json.loads(json_text)
    print(json_object)
    recovered = Ingredient()
    recovered.name = json_object["name"]
    recovered.quantity = json_object["quantity"]
    recovered.measure_unit = json_object["measure_unit"]
    print(recovered.name, recovered.quantity, recovered.measure_unit)
except IOError as exception:
    print("Error when trying to read the text file.", file=sys.stderr)
    print(exception)
except OSError as exception:
    print("Error when trying to read the text file.", file=sys.stderr)
    print(exception)
-- <https://github.com/rxi/json.lua>
local json = require "json"

local EXIT_FAILURE = 1

function new_ingredient(name, quantity, measure_unit)
    local result = {
        name = name or "",
        quantity = quantity or 0.0,
        measure_unit = measure_unit or ""
    }

    return result
end

local file_path = "franco-garcia-ingredient.json"

local file = io.open(file_path, "r")
if (file == nil) then
    print(debug.traceback())

    print("Error when trying to read the text file.")
    os.exit(EXIT_FAILURE)
end

local json_text = file:read("*all")
io.close(file)

-- Deserialization.
local json_object = json.decode(json_text)
print(json_object.name, json_object.quantity, json_object.measure_unit)
local recovered = new_ingredient()
recovered.name = json_object["name"]
recovered.quantity = json_object["quantity"]
recovered.measure_unit = json_object["measure_unit"]
print(recovered.name, recovered.quantity, recovered.measure_unit)
extends Node

const EXIT_FAILURE = 1

class Ingredient:
    var name
    var quantity
    var measure_unit

    func _init(name = "", quantity = 0.0, measure_unit = ""):
        self.name = name
        self.quantity = quantity
        self.measure_unit = measure_unit

func _ready():
    var file_path = "franco-garcia-ingredient.json"

    var file = File.new()
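    if (file.open(file_path, File.READ) != OK):
        printerr("Error when trying to read the text file.")
        get_tree().quit(EXIT_FAILURE)
        # quit() only requests the exit; return to avoid reading the unopened file.
        return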

    var json_text = file.get_as_text()
    file.close()

    # Deserialization.
    var json_object = parse_json(json_text)
    print(json_object)
    var recovered = Ingredient.new()
    recovered.name = json_object["name"]
    recovered.quantity = json_object["quantity"]
    recovered.measure_unit = json_object["measure_unit"]
    printt(recovered.name, recovered.quantity, recovered.measure_unit)

The JavaScript version requires an HTML page with a form to send the JSON file with the ingredient.

<!DOCTYPE html>
<html lang="en">

  <head>
    <meta charset="utf-8">
    <title>JSON File Read</title>
    <meta name="author" content="Franco Eusébio Garcia">
    <meta name="viewport" content="width=device-width, initial-scale=1">
  </head>

  <body>
    <header>
      <h1>JSON File Read with Ingredient</h1>
    </header>

    <main>
      <!-- Form to send the text file. -->
      <form method="post"
            enctype="multipart/form-data"
            onsubmit="return read_file(files.files[0])">
        <label for="files">Choose a file with an ingredient:</label>
        <input id="files"
               name="files"
               type="file"
               accept="application/json"/>
        <input type="submit"/>
      </form>

      <div id="contents">
      </div>

      <!-- The name of the JavaScript file must match the one defined below. -->
      <script src="./script.js"></script>
    </main>
  </body>

</html>

As was the case for serialization, the loading code implements the deserialization part of the original example.
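A file edited by hand or produced by another program may not contain what the loader expects. The sketch below assumes a hypothetical helper, parse_ingredient(), which is not part of the original programs. It returns None when the text is not valid JSON or when a field of the Ingredient record is missing.

```python
import json

def parse_ingredient(json_text):
    """Return a dictionary with the ingredient, or None if the text is invalid."""
    try:
        data = json.loads(json_text)
    except json.JSONDecodeError:
        # The text is not valid JSON.
        return None

    if not isinstance(data, dict):
        # Valid JSON, but not an object (for instance, an array or a number).
        return None

    for key in ("name", "quantity", "measure_unit"):
        if key not in data:
            # A required field is missing.
            return None

    return data

print(parse_ingredient('{"name":"Water","quantity":3.0,"measure_unit":"Cups"}'))
print(parse_ingredient('{"name":"Water"}'))
print(parse_ingredient('not JSON'))
```

Checking before accessing the keys avoids a KeyError in the middle of the deserialization.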

As one may notice, many programs can share files. This allows, among other applications, exchanging data among programs. A program can create a file, which can be modified by another program, then shared with a third program, then viewed in a fourth program (potentially on another machine).

More Complex Example: Recipe Record

After the basic example, it is worth considering a more complex example, with arrays and records. To organize the code in smaller blocks, the example provides auxiliary subroutines.

// This file must be saved in a file called "script.js".
// It will be processed by an HTML page with code to send the
// text file by a form.

class Ingredient {
    constructor(name = "", quantity = 0.0, measure_unit = "") {
        this.name = name
        this.quantity = quantity
        this.measure_unit = measure_unit
    }
}

class Recipe {
    constructor(name = "", preparation_steps = "", ingredients = []) {
        this.name = name
        this.preparation_steps = preparation_steps
        this.ingredients = ingredients
    }
}

function ingredient_to_dictionary(ingredient) {
    let result = JSON.stringify(ingredient)

    return result
}

function dictionary_to_ingredient(ingredient_dictionary) {
    let result = new Ingredient()
    result.name = ingredient_dictionary["name"]
    result.quantity = ingredient_dictionary["quantity"]
    result.measure_unit = ingredient_dictionary["measure_unit"]

    return result
}

function recipe_to_dictionary(recipe) {
    let result = JSON.stringify(recipe)

    return result
}

function dictionary_to_recipe(recipe_dictionary) {
    let result = new Recipe()
    result.name = recipe_dictionary["name"]
    result.preparation_steps = recipe_dictionary["preparation_steps"]

    for (let ingredient_dictionary of recipe_dictionary["ingredients"]) {
        let ingredient = dictionary_to_ingredient(ingredient_dictionary)
        result.ingredients.push(ingredient)
    }

    return result
}

function recipes_to_array_of_dictionaries(recipes) {
    let result = JSON.stringify(recipes)

    return result
}

function array_of_dictionaries_to_recipe(recipe_dictionarys) {
    let result = []
    for (let recipe_dictionary of recipe_dictionarys) {
        let recipe = dictionary_to_recipe(recipe_dictionary)
        result.push(recipe)
    }

    return result
}

function save_recipes(recipes, file_path) {
    let file_contents = recipes_to_array_of_dictionaries(recipes)

    let file = new File([file_contents], file_path, {type: "application/json"})
    console.log("File created successfully.")

    let download_link = document.createElement("a")
    download_link.target = "_blank"
    download_link.href = URL.createObjectURL(file)
    download_link.download = file.name
    if (confirm("Do you want to download the file '" + file.name + "'?")) {
        download_link.click()
        // In this case, revokeObjectURL() can be used both for confirmation
        // and for cancelling.
        URL.revokeObjectURL(download_link.href)
    }
}

function load_recipes(text_file = null) {
    // NOTE If the load fails, the implementation uses default values.
    // This is performed to create an initial file with recipes.
    let result = [
        new Recipe("Bread",
                    "...",
                    [
                        new Ingredient("Water", 3.0, "Cups"),
                        new Ingredient("Flour", 4.0, "Cups"),
                        new Ingredient("Salt", 2.0, "Tablespoons"),
                        new Ingredient("Yeast", 2.0, "Teaspoons")
                    ]),
        new Recipe("Sweet Bread",
                    "...",
                    [
                        new Ingredient("Water", 3.0, "Cups"),
                        new Ingredient("Flour", 4.0, "Cups"),
                        new Ingredient("Sugar", 2.0, "Cups"),
                        new Ingredient("Salt", 2.0, "Tablespoons"),
                        new Ingredient("Yeast", 2.0, "Teaspoons")
                    ]),
    ]

    return result
}

function main() {
    let file_path = "franco-garcia-recipes.json"

    let recipes = load_recipes(/*file_path*/)
    save_recipes(recipes, file_path)
}

function read_file(file_json) {
    if (!file_json) {
        // Returning false prevents the form from being submitted.
        return false
    }

    if (!(file_json instanceof File)) {
        return false
    }

    let file_reader = new FileReader()
    file_reader.onload = function(event) {
        let contents = event.target.result
        let recipe_dictionarys = JSON.parse(contents)
        let result = array_of_dictionaries_to_recipe(recipe_dictionarys)

        console.log(result)
    }
    file_reader.readAsText(file_json)

    // Disallows the submission of the form, allowing the visualization of the
    // result in the console of the same page.
    return false
}

main()
import io
import json
import sys

class Ingredient:
    def __init__(self, name = "", quantity = 0.0, measure_unit = ""):
        self.name = name
        self.quantity = quantity
        self.measure_unit = measure_unit

class Recipe:
    def __init__(self, name = "", preparation_steps = "", ingredients = None):
        self.name = name
        self.preparation_steps = preparation_steps
        self.ingredients = ingredients if (ingredients != None) else []

def ingredient_to_dictionary(ingredient):
    result = {
        "name": ingredient.name,
        "quantity": ingredient.quantity,
        "measure_unit": ingredient.measure_unit
    }

    return result

def dictionary_to_ingredient(ingredient_dictionary):
    result = Ingredient()
    result.name = ingredient_dictionary["name"]
    result.quantity = ingredient_dictionary["quantity"]
    result.measure_unit = ingredient_dictionary["measure_unit"]

    return result

def recipe_to_dictionary(recipe):
    result = {
        "name": recipe.name,
        "preparation_steps": recipe.preparation_steps,
        "ingredients": []
    }

    for ingredient in recipe.ingredients:
        result["ingredients"].append(ingredient_to_dictionary(ingredient))

    return result

def dictionary_to_recipe(recipe_dictionary):
    result = Recipe()
    result.name = recipe_dictionary["name"]
    result.preparation_steps = recipe_dictionary["preparation_steps"]

    for ingredient_dictionary in recipe_dictionary["ingredients"]:
        ingredient = dictionary_to_ingredient(ingredient_dictionary)
        result.ingredients.append(ingredient)

    return result

def recipes_to_array_of_dictionaries(recipes):
    result = []
    for recipe in recipes:
        result.append(recipe_to_dictionary(recipe))

    return result

def array_of_dictionaries_to_recipe(recipe_dictionarys):
    result = []
    for recipe_dictionary in recipe_dictionarys:
        recipe = dictionary_to_recipe(recipe_dictionary)
        result.append(recipe)

    return result

def save_recipes(recipes, file_path):
    file_contents = recipes_to_array_of_dictionaries(recipes)

    try:
        file = open(file_path, "w")
        file.write(json.dumps(file_contents))
        file.close()

        print("File created successfully.")
    except IOError as exception:
        print("Error when trying to create the text file.", file=sys.stderr)
        print(exception)
    except OSError as exception:
        print("Error when trying to create the text file.", file=sys.stderr)
        print(exception)

def load_recipes(file_path = None):
    if (file_path != None):
        try:
            file = open(file_path, "r")
            contents = file.read()
            file.close()

            recipe_dictionarys = json.loads(contents)
            result = array_of_dictionaries_to_recipe(recipe_dictionarys)

            return result
        except IOError as exception:
            print("Error when trying to read the text file.", file=sys.stderr)
            print(exception)
        except OSError as exception:
            print("Error when trying to read the text file.", file=sys.stderr)
            print(exception)

    # NOTE If the load fails, the implementation uses default values.
    # This is performed to create an initial file with recipes.
    result = [
        Recipe("Bread",
                "...",
                [
                    Ingredient("Water", 3.0, "Cups"),
                    Ingredient("Flour", 4.0, "Cups"),
                    Ingredient("Salt", 2.0, "Tablespoons"),
                    Ingredient("Yeast", 2.0, "Teaspoons")
                ]),
        Recipe("Sweet Bread",
                "...",
                [
                    Ingredient("Water", 3.0, "Cups"),
                    Ingredient("Flour", 4.0, "Cups"),
                    Ingredient("Sugar", 2.0, "Cups"),
                    Ingredient("Salt", 2.0, "Tablespoons"),
                    Ingredient("Yeast", 2.0, "Teaspoons")
                ]),
    ]

    return result

def main():
    file_path = "franco-garcia-recipes.json"

    recipes = load_recipes(file_path)
    save_recipes(recipes, file_path)

if (__name__ == "__main__"):
    main()
-- <https://github.com/rxi/json.lua>
local json = require "json"

local EXIT_FAILURE = 1

function new_ingredient(name, quantity, measure_unit)
    local result = {
        name = name or "",
        quantity = quantity or 0.0,
        measure_unit = measure_unit or ""
    }

    return result
end

function new_recipe(name, preparation_steps, ingredients)
    local result = {
        name = name or "",
        preparation_steps = preparation_steps or "",
        ingredients = ingredients or {}
    }

    return result
end

function ingredient_to_dictionary(ingredient)
    local result = {
        name = ingredient.name,
        quantity = ingredient.quantity,
        measure_unit = ingredient.measure_unit
    }

    return result
end

function dictionary_to_ingredient(ingredient_dictionary)
    local result = new_ingredient()
    result.name = ingredient_dictionary["name"]
    result.quantity = ingredient_dictionary["quantity"]
    result.measure_unit = ingredient_dictionary["measure_unit"]

    return result
end

function recipe_to_dictionary(recipe)
    local result = {
        name = recipe.name,
        preparation_steps = recipe.preparation_steps,
        ingredients = {}
    }

    for _, ingredient in ipairs(recipe.ingredients) do
        table.insert(result["ingredients"], ingredient_to_dictionary(ingredient))
    end

    return result
end

function dictionary_to_recipe(recipe_dictionary)
    local result = new_recipe()
    result.name = recipe_dictionary["name"]
    result.preparation_steps = recipe_dictionary["preparation_steps"]

    for _, ingredient_dictionary in ipairs(recipe_dictionary["ingredients"]) do
        local ingredient = dictionary_to_ingredient(ingredient_dictionary)
        table.insert(result.ingredients, ingredient)
    end

    return result
end

function recipes_to_array_of_dictionaries(recipes)
    local result = {}
    for _, recipe in ipairs(recipes) do
        table.insert(result, recipe_to_dictionary(recipe))
    end

    return result
end

function array_of_dictionaries_to_recipe(recipe_dictionarys)
    local result = {}
    for _, recipe_dictionary in ipairs(recipe_dictionarys) do
        local recipe = dictionary_to_recipe(recipe_dictionary)
        table.insert(result, recipe)
    end

    return result
end

function save_recipes(recipes, file_path)
    local file_contents = recipes_to_array_of_dictionaries(recipes)

    local file = io.open(file_path, "w")
    if (file == nil) then
        print(debug.traceback())

        error("Error when trying to create the text file.")
        os.exit(EXIT_FAILURE)
    end

    file:write(json.encode(file_contents))
    io.close(file)

    print("File created successfully.")
end

function load_recipes(file_path)
    file_path = file_path or nil
    if (file_path ~= nil) then
        local file = io.open(file_path, "r")
        if (file == nil) then
            print(debug.traceback())

            print("Error when trying to read the text file.")
            -- os.exit(EXIT_FAILURE)
        else
            local contents = file:read("*all")
            io.close(file)

            local recipe_dictionarys = json.decode(contents)
            local result = array_of_dictionaries_to_recipe(recipe_dictionarys)

            return result
        end
    end

    -- NOTE If the load fails, the implementation uses default values.
    -- This is performed to create an initial file with recipes.
    local result = {
        new_recipe("Bread",
                     "...",
                     {
                         new_ingredient("Water", 3.0, "Cups"),
                         new_ingredient("Flour", 4.0, "Cups"),
                         new_ingredient("Salt", 2.0, "Tablespoons"),
                         new_ingredient("Yeast", 2.0, "Teaspoons")
                     }),
        new_recipe("Sweet Bread",
                     "...",
                     {
                         new_ingredient("Water", 3.0, "Cups"),
                         new_ingredient("Flour", 4.0, "Cups"),
                         new_ingredient("Sugar", 2.0, "Cups"),
                         new_ingredient("Salt", 2.0, "Tablespoons"),
                         new_ingredient("Yeast", 2.0, "Teaspoons")
                     }),
    }

    return result
end

function main()
    local file_path = "franco-garcia-recipes.json"

    local recipes = load_recipes(file_path)
    save_recipes(recipes, file_path)
end

main()
extends Node

const EXIT_FAILURE = 1

class Ingredient:
    var name
    var quantity
    var measure_unit

    func _init(name = "", quantity = 0.0, measure_unit = ""):
        self.name = name
        self.quantity = quantity
        self.measure_unit = measure_unit

class Recipe:
    var name
    var preparation_steps
    var ingredients

    func _init(name = "", preparation_steps = "", ingredients = []):
        self.name = name
        self.preparation_steps = preparation_steps
        self.ingredients = ingredients

func ingredient_to_dictionary(ingredient):
    var result = {
        "name": ingredient.name,
        "quantity": ingredient.quantity,
        "measure_unit": ingredient.measure_unit
    }

    return result

func dictionary_to_ingredient(ingredient_dictionary):
    var result = Ingredient.new()
    result.name = ingredient_dictionary["name"]
    result.quantity = ingredient_dictionary["quantity"]
    result.measure_unit = ingredient_dictionary["measure_unit"]

    return result

func recipe_to_dictionary(recipe):
    var result = {
        "name": recipe.name,
        "preparation_steps": recipe.preparation_steps,
        "ingredients": []
    }

    for ingredient in recipe.ingredients:
        result["ingredients"].append(ingredient_to_dictionary(ingredient))

    return result

func dictionary_to_recipe(recipe_dictionary):
    var result = Recipe.new()
    result.name = recipe_dictionary["name"]
    result.preparation_steps = recipe_dictionary["preparation_steps"]

    for ingredient_dictionary in recipe_dictionary["ingredients"]:
        var ingredient = dictionary_to_ingredient(ingredient_dictionary)
        result.ingredients.append(ingredient)

    return result

func recipes_to_array_of_dictionaries(recipes):
    var result = []
    for recipe in recipes:
        result.append(recipe_to_dictionary(recipe))

    return result

func array_of_dictionaries_to_recipe(recipe_dictionarys):
    var result = []
    for recipe_dictionary in recipe_dictionarys:
        var recipe = dictionary_to_recipe(recipe_dictionary)
        result.append(recipe)

    return result

func save_recipes(recipes, file_path):
    var file_contents = recipes_to_array_of_dictionaries(recipes)

    var file = File.new()
    if (file.open(file_path, File.WRITE) != OK):
        printerr("Error when trying to create the text file.")
        get_tree().quit(EXIT_FAILURE)
        # quit() only requests the exit; return to avoid using the unopened file.
        return

    file.store_line(to_json(file_contents))
    file.close()

    print("File created successfully.")

func load_recipes(file_path = null):
    if (file_path != null):
        var file = File.new()
        if (file.open(file_path, File.READ) != OK):
            printerr("Error when trying to read the text file.")
        else:
            var contents = file.get_as_text()
            file.close()

            var recipe_dictionarys = parse_json(contents)
            var result = array_of_dictionaries_to_recipe(recipe_dictionarys)

            return result

    # NOTE If the load fails, the implementation uses default values.
    # This is performed to create an initial file with recipes.
    var result = [
        Recipe.new("Bread",
                    "...",
                    [
                        Ingredient.new("Water", 3.0, "Cups"),
                        Ingredient.new("Flour", 4.0, "Cups"),
                        Ingredient.new("Salt", 2.0, "Tablespoons"),
                        Ingredient.new("Yeast", 2.0, "Teaspoons")
                    ]),
        Recipe.new("Sweet Bread",
                    "...",
                    [
                        Ingredient.new("Water", 3.0, "Cups"),
                        Ingredient.new("Flour", 4.0, "Cups"),
                        Ingredient.new("Sugar", 2.0, "Cups"),
                        Ingredient.new("Salt", 2.0, "Tablespoons"),
                        Ingredient.new("Yeast", 2.0, "Teaspoons")
                    ]),
    ]

    return result

func _ready():
    var file_path = "franco-garcia-recipes.json"

    var recipes = load_recipes(file_path)
    save_recipes(recipes, file_path)

The versions in Python, GDScript and Lua are simpler to understand, as it is possible to use files without a form in the browser. The JavaScript version is slightly different from the others, as it requires using an HTML page to send the file. The next example provides a simple HTML page with a form to send the file. To make it easier to use the program (although providing awful usability), a JSON file will be created with recipes and offered as a download. The downloaded JSON file should be sent to the HTML page. After sending the file, the content is displayed in the console. For better presentation, one could use the remainder of the original code to convert data to strings, and show the string in the browser.

<!DOCTYPE html>
<html lang="en">

  <head>
    <meta charset="utf-8">
    <title>Create and Read JSON File</title>
    <meta name="author" content="Franco Eusébio Garcia">
    <meta name="viewport" content="width=device-width, initial-scale=1">
  </head>

  <body>
    <header>
      <h1>Read JSON File with Recipes</h1>
    </header>

    <main>
      <!-- Form to send the text file. -->
      <form method="post"
            enctype="multipart/form-data"
            onsubmit="return read_file(files.files[0])">
        <label for="files">Choose a file with recipes:</label>
        <input id="files"
               name="files"
               type="file"
               accept="application/json"/>
        <input type="submit"/>
      </form>

      <div id="contents">
      </div>

      <!-- The name of the JavaScript file must match the one defined below. -->
      <script src="./script.js"></script>
    </main>
  </body>

</html>

All programs will generate the file franco-garcia-recipes.json with two predefined recipes. The resulting file will probably have a single line, with few or no spaces between values. To make it easier to read, the following code block also provides the file formatted with indentation.

[{"name":"Bread","preparation_steps":"...","ingredients":[{"name":"Water","quantity":3,"measure_unit":"Cups"},{"name":"Flour","quantity":4,"measure_unit":"Cups"},{"name":"Salt","quantity":2,"measure_unit":"Tablespoons"},{"name":"Yeast","quantity":2,"measure_unit":"Teaspoons"}]},{"name":"Sweet Bread","preparation_steps":"...","ingredients":[{"name":"Water","quantity":3,"measure_unit":"Cups"},{"name":"Flour","quantity":4,"measure_unit":"Cups"},{"name":"Sugar","quantity":2,"measure_unit":"Cups"},{"name":"Salt","quantity":2,"measure_unit":"Tablespoons"},{"name":"Yeast","quantity":2,"measure_unit":"Teaspoons"}]}]
[
  {
    "name": "Bread",
    "preparation_steps": "...",
    "ingredients": [
      {
        "name": "Water",
        "quantity": 3,
        "measure_unit": "Cups"
      },
      {
        "name": "Flour",
        "quantity": 4,
        "measure_unit": "Cups"
      },
      {
        "name": "Salt",
        "quantity": 2,
        "measure_unit": "Tablespoons"
      },
      {
        "name": "Yeast",
        "quantity": 2,
        "measure_unit": "Teaspoons"
      }
    ]
  },
  {
    "name": "Sweet Bread",
    "preparation_steps": "...",
    "ingredients": [
      {
        "name": "Water",
        "quantity": 3,
        "measure_unit": "Cups"
      },
      {
        "name": "Flour",
        "quantity": 4,
        "measure_unit": "Cups"
      },
      {
        "name": "Sugar",
        "quantity": 2,
        "measure_unit": "Cups"
      },
      {
        "name": "Salt",
        "quantity": 2,
        "measure_unit": "Tablespoons"
      },
      {
        "name": "Yeast",
        "quantity": 2,
        "measure_unit": "Teaspoons"
      }
    ]
  }
]

In this second example, operations with the content of the JSON file are similar to manipulating an array of dictionaries. Each dictionary corresponds to the Recipe record encoded as a dictionary. Similarly, to create the JSON file, a dictionary is created with the desired content (or an array with dictionaries storing the content).

In Python, Lua and GDScript, an array of dictionaries is created to convert data to JSON (in other words, to perform the serialization). Strictly speaking, the Lua implementation could skip creating a table, as the data was already in a table. The data of each record is converted to a dictionary to prepare it for serialization. Next, all the data is grouped into an array. The serialization converts the data from the array to a JSON string, which is then stored in a text file.

The deserialization is the reverse process. The contents of the JSON file are read as a string, which is, next, converted into an array of dictionaries to abstract the JSON. To restore the original records, each dictionary is used to initialize the values of new variables for the suitable types. The iteration for each value creates the recipes array.

An interesting fact about the programs is that, as they all use JSON to store and load recipes, one can share the recipe files among implementations written in different programming languages. Provided that the encoding is the same (for instance, UTF-8), the programs will work correctly. With proper care, it is also possible to include or remove recipes (and/or ingredients) using a text editor to modify the file.

In other words, besides using JSON, this example acts as a model of using files to exchange data among different programs. Provided that they all follow the same format (in this case, the format specified for the array of recipes in JSON), it is possible to create the very same file in different programs, regardless of the programming language used to write each program. For instance, one could propose the extension .recipes to use with the four previous programs. All could generate and load data using the proposed format, which is represented using JSON.

By the way, one can open files in text editors, hexadecimal editors and file compressors (such as for the zip format) to search for clues of what the file stores. If the clue suggests a known file type (or a simple one), it is possible to edit the file externally, without the original program. It is even possible to create a new program to manipulate it (instead of using the original program).
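
As an illustration of searching for clues, the following Python sketch compares the first bytes of a file with a few well-known signatures (PNG, JPEG, ZIP and PDF). The names SIGNATURES, looks_like_utf8() and guess_file_type() are hypothetical and were chosen for this example; the table is far from complete, and a match is only a hint about the contents, not a guarantee.

```python
import codecs

# Hypothetical table with a few well-known file signatures ("magic numbers").
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "PNG image"),
    (b"\xff\xd8\xff", "JPEG image"),
    (b"PK\x03\x04", "ZIP archive (or a format built on ZIP)"),
    (b"%PDF", "PDF document"),
]


def looks_like_utf8(sample):
    # The incremental decoder tolerates a sample that ends in the middle of
    # a multi-byte character, which can happen when reading only a few bytes.
    decoder = codecs.getincrementaldecoder("utf-8")()
    try:
        decoder.decode(sample, final=False)
        return True
    except UnicodeDecodeError:
        return False


def guess_file_type(first_bytes):
    for signature, description in SIGNATURES:
        if first_bytes.startswith(signature):
            return description

    if looks_like_utf8(first_bytes):
        return "possibly text (valid UTF-8)"

    return "unknown"


def guess_file_type_from_path(file_path, sample_size=16):
    # Binary mode: the bytes are read as they are stored in the file.
    with open(file_path, "rb") as file:
        return guess_file_type(file.read(sample_size))


print(guess_file_type(b"PK\x03\x04rest of the file"))
print(guess_file_type(b'[{"name":"Bread"}]'))
```

For a JSON file such as the recipes file, the sketch would only report that the contents look like text; opening the file in a text editor would then reveal the JSON structure.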

Copying Files

File managers allow copying and pasting files to generate a copy that is identical to the original file. This can also be done using files in a programming language. In fact, the implementation is quite simple: it suffices to read each byte from a file and write them in a second file.

For this example, you should create a file named franco-garcia-original_file.txt with any content (preferably not empty). For instance, the file can store the classic Olá, meu nome Franco!. The file can be of any type; the example uses a text file because it is easier to read the results in the copy in a text editor. The example also serves to demonstrate that one can open a text file in binary mode (because the text file is a binary file). However, you can choose a file of any type and with any content, such as an image, a video, an executable file of a program, or the source code file used to write the program itself (although some text editors or IDEs may block the file being edited). Evidently, choose a file that has a back-up.

Both the path to the original file (file_path) and to the created copy (copy_path) can be changed. Thus, choose a name for the copy that does not overwrite the existing file.

The versions in Python, Lua and GDScript copy the original file byte by byte. The version in JavaScript copies the whole file at once, as all bytes are read by readAsArrayBuffer(). In JavaScript running in a browser, the file must be generated at once, so it is not possible to save the data byte by byte. For educational purposes, one could copy the byte array index by index into a second array. However, the second array would only be a copy of the first. In other words, one would waste processor cycles and memory without additional benefit.

Warning. In this example, besides the file in file_path, another file will be generated in the path defined in copy_path, with the value franco-garcia-created_copy.txt. Take proper care to avoid overwriting an existing file.

// This file must be saved in a file called "script.js".
// It will be processed by an HTML page with code to send the
// text file by a form.

function copy_file(original) {
    if (!original) {
        return
    } else if (!(original instanceof File)) {
        return
    }

    let file_reader = new FileReader()
    file_reader.onload = function(event) {
        let copy_path = "franco-garcia-created_copy.txt"

        // bytes already represents the contents of the original file, thus
        // it suffices to write the value.
        let bytes = new Uint8Array(event.target.result)
        let data = new Blob([bytes], {type: "application/octet-stream"})
        let file_copy = new File([data], copy_path, {type: data.type})
        console.log("File copied successfully.")

        let download_link = document.createElement("a")
        download_link.target = "_blank"
        download_link.href = URL.createObjectURL(file_copy)
        download_link.download = file_copy.name
        if (confirm("Do you want to download the file '" + file_copy.name + "'?")) {
            download_link.click()
            // In this case, revokeObjectURL() can be used both for confirmation
            // and for cancelling.
            URL.revokeObjectURL(download_link.href)
        }
    }
    file_reader.readAsArrayBuffer(original)

    // Disallows the submission of the form, which keeps the current page
    // loaded while the file is read and copied.
    return false
}
import io
import struct
import sys

try:
    file_path = "franco-garcia-original_file.txt"
    copy_path = "franco-garcia-created_copy.txt"

    original = open(file_path, "rb")
    file_copy = open(copy_path, "wb")

    read_byte = original.read(1)
    while (read_byte):
        file_copy.write(read_byte)
        read_byte = original.read(1)

    original.close()
    file_copy.close()

    print("File copied successfully.")
except OSError as exception:
    # In Python 3, IOError is an alias of OSError; one handler covers both.
    print("Error when trying to create the binary file.", file=sys.stderr)
    print(exception)
local EXIT_FAILURE = 1

local file_path = "franco-garcia-original_file.txt"
local copy_path = "franco-garcia-created_copy.txt"

local original = io.open(file_path, "rb")
local file_copy = io.open(copy_path, "wb")
if ((original == nil) or (file_copy == nil)) then
    print(debug.traceback())

    error("Error when trying to create the binary file.")
    os.exit(EXIT_FAILURE)
end

local read_byte = original:read(1)
while (read_byte) do
    file_copy:write(read_byte)
    read_byte = original:read(1)
end

io.close(original)
io.close(file_copy)

print("File copied successfully.")
extends Node

const EXIT_FAILURE = 1

func _ready():
    var file_path = "franco-garcia-original_file.txt"
    var copy_path = "franco-garcia-created_copy.txt"

    var original = File.new()
    var file_copy = File.new()
    if ((original.open(file_path, File.READ) != OK) or
        (file_copy.open(copy_path, File.WRITE) != OK)):
        printerr("Error when trying to create the binary file.")
        get_tree().quit(EXIT_FAILURE)

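    # get_8() returns 0 both for a byte with value zero and after the end of
    # the file; eof_reached() distinguishes the two cases.
    var read_byte = original.get_8()
    while (not original.eof_reached()):
        file_copy.store_8(read_byte)
        read_byte = original.get_8()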

    original.close()
    file_copy.close()

    print("File copied successfully.")

The JavaScript version requires an HTML file to send the original file. If you reuse one of the previous HTML pages, note that, in onsubmit, read_file() was replaced by copy_file(), which is the name of the function defined in the JavaScript code.

<!DOCTYPE html>
<html lang="pt-BR">

  <head>
    <meta charset="utf-8">
    <title>File Copy</title>
    <meta name="author" content="Franco Eusébio Garcia">
    <meta name="viewport" content="width=device-width, initial-scale=1">
  </head>

  <body>
    <header>
      <h1>File Copy</h1>
    </header>

    <main>
      <!-- Form to send the text file. -->
      <form method="post"
            enctype="multipart/form-data"
            onsubmit="return copy_file(files.files[0])">
        <label for="files">Choose a file:</label>
        <input id="files"
               name="files"
               type="file"
               accept="application/octet-stream"/>
        <input type="submit"/>
      </form>

      <div id="contents">
      </div>

      <!-- The name of the JavaScript file must match the one defined below. -->
      <script src="./script.js"></script>
    </main>
  </body>

</html>

As it was done in JavaScript, in programming languages that allow operating with byte arrays or memory blocks, one can simply read the entire contents of the original file and store them in the target file. Evidently, this depends on the size of the original file: the available primary memory must be enough to store all the bytes of the original file. For files that are larger than the available primary memory, a copy using smaller memory blocks (called chunks) is inevitable.
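
A minimal sketch of a copy in chunks, written in Python, could look like the following. The function name copy_in_chunks() and the chunk size of 64 KiB are assumptions made for this example; the best size depends on the system. The with statement closes both files automatically, even if an error happens. To avoid overwriting existing files, the demonstration uses a temporary directory.

```python
import os
import tempfile

# Hypothetical value: 64 KiB per chunk.
CHUNK_SIZE = 64 * 1024


def copy_in_chunks(file_path, copy_path, chunk_size=CHUNK_SIZE):
    """Copy a file while holding at most chunk_size bytes in memory."""
    copied_bytes = 0
    with open(file_path, "rb") as original, open(copy_path, "wb") as file_copy:
        chunk = original.read(chunk_size)
        # read() returns an empty bytes object at the end of the file.
        while chunk:
            file_copy.write(chunk)
            copied_bytes += len(chunk)
            chunk = original.read(chunk_size)

    return copied_bytes


# Demonstration with temporary files.
with tempfile.TemporaryDirectory() as directory:
    file_path = os.path.join(directory, "original.bin")
    copy_path = os.path.join(directory, "copy.bin")
    with open(file_path, "wb") as file:
        file.write(bytes(range(256)) * 1000)

    total = copy_in_chunks(file_path, copy_path, chunk_size=4096)
    print("Copied bytes:", total)
```

The loop has the same structure as the byte by byte version; only the quantity of bytes per read changes. In real programs, Python's standard library already provides shutil.copyfile() for this purpose.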

Image Files and Raster Images (Raster Graphics)

The implementation of Conway's Game of Life defined the first animation (in the terminal or console, though it is an animation) of this material. With files, it is now possible to create the first image. The image will not be shown in the program, though it will be provided as a file.

If you are using a screen reader, this subsection will be strange to hear. Many values are part of matrices; thus, the narration will include potentially long numerical sequences.

Image Visualization and Manipulation

Note. If you have chosen to use the online tools, you will not have to install any program described in this subsection.

As the created images will be small, it can be helpful to magnify the resulting image (with zoom) or apply a scale operation for a better visualization. A subsection will discuss a simple way to magnify the image, though the result is suitable only for pixel art.

In some operating systems (such as Windows), the default image viewer does not read files with the formats that will be used in this section. To open the created images, there are programs such as GIMP (which is also an image editor), IrfanView, or XnView. These programs can be installed using Ninite, Chocolatey and Scoop. GIMP is open source; IrfanView and XnView are not (though they are free of charge for personal use). For an open source image viewer with a more modern look, one alternative is ImageGlass (available for manual installation, or using Chocolatey or Scoop). Another open source option is Gwenview, from KDE. Gwenview will appear in illustrations displaying the created images in the next sections.

If you do not want to install a program, you can use Magick Online Studio. Choose a file in Browse... then press the button View to load the image into the service. Evidently, take proper care when sending any image (or any file) to an Internet service; some data should not leave your machine. Another option is using JqMagick, which is also provided by ImageMagick. To send an image, choose the icon with a paper sheet with a green arrow inside a circle.

The software ImageMagick is an open source command line program to create, edit and convert images. The online version called Magick Online Studio provides a Web interface for some commands that are available in the command line. For instance, to increase the size of an image, one can choose Transform, then Resize, choose a new size (for instance, change from 23x11 to 230x110 to scale the image by a factor of ten), then press the button resize.

If ImageMagick is installed in your machine, the same operation can be performed using the command line, even to convert formats. ImageMagick provides the command convert to convert and edit images.

convert input.pbm -scale 1000% result.png

The previous command scales input.pbm by a factor of ten (1000%), converts the result to the PNG format and saves it in the file result.png.

ImageMagick can also show images using the command display.

display result.png

Both operations can be useful when they are combined using a command interpreter. When a program ends, the value returned after the execution can inform whether the program terminated successfully. The value 0 usually means success; the value 1, written as EXIT_FAILURE in some examples, means error in many systems. The exact values can vary among platforms and operating systems.

Regardless, in a command line interpreter such as Bash or ZSH, the operator && can be used to run the next command only if the previous one was successful. Thus, it is possible to combine running the program with the commands convert and display to show the scaled image after running the code of a program.

python script.py &&
convert franco-garcia-franco_name.pbm -scale 1000% franco-garcia-franco_name_scaled.pbm &&
display franco-garcia-franco_name_scaled.pbm

The previous command runs the Python code stored in a file called script.py. If the execution is successful, the following line runs the command convert to scale the image up. If the scaling is successful, the scaled image is shown using display. In other words, the three commands are combined into a single one that automates scaling and showing the image resulting from the Python program.

As the example suggests, the command line can serve as a powerful complement to programming languages. Instead of Python, one could use Lua, GDScript or any other programming language (that supports use via command line) with other computer programs to process results as sequences of operations. Although it is more complicated doing this using JavaScript for browsers (for instance, using a browser in headless mode), it can be done using interpreters such as SpiderMonkey or Node.js.

In many programming languages, it is also possible to define commands to run other programs and get their results in the code of a program, with an external call. Although this can be practical, the use of an external program can represent a vulnerability, with potential security risks. A safer way is using a file to share data among programs, or more advanced features such as sockets or pipes for communication or passing results forward.
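
If an external call is used anyway, a hedged sketch in Python could use subprocess.run() from the standard library. To keep the example runnable without installing anything, the external program is the Python interpreter itself (sys.executable), used here as a stand-in for a command such as convert. The arguments are passed as a list, so no command line interpreter processes them; this reduces (though it does not eliminate) the risks of building commands from untrusted text.

```python
import subprocess
import sys

# The external program and its arguments, one item per argument.
# The Python interpreter is a stand-in for another program in this sketch.
command = [sys.executable, "-c", "print('Hello from another process')"]

# capture_output stores what the program writes; text decodes it as strings.
result = subprocess.run(command, capture_output=True, text=True)

# As in the command line, 0 conventionally means success.
if result.returncode == 0:
    print("Success:", result.stdout.strip())
else:
    print("Failure with code", result.returncode, file=sys.stderr)
    print(result.stderr, file=sys.stderr)
```

With ImageMagick installed, the list could instead be ["convert", "input.pbm", "-scale", "1000%", "result.png"], mirroring the command shown earlier in this subsection.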

Portable Bitmap Format (PBM)

The graphical package Netpbm (or Pbmplus; official page) defines three simple formats for images:

  • Portable Bitmap Format (PBM);
  • Portable Graymap Format (PGM);
  • Portable Pixmap Format (PPM).

The formats are not efficient regarding their size and performance (in the text versions); however, they are simple to understand and implement, serving as a good introduction to image files. The English Wikipedia page describes the formats. The Portuguese page provides an image of a diskette in the format PBM.

The specification of the PBM format is available in the documentation of the format. One can define images in the format using a binary file or a text file. The text definition is a convenience of the format; images are usually defined as binary files, to reduce the size of the file and improve the performance of loading data.

Although inefficient, the text version is useful for learning purposes -- after all, one can create and edit images with a text editor. For instance, the following snippet contains an image in text in the format PBM that draws a square with black borders and white center. If the content is saved in a text editor with the extension .pbm (for instance, square.pbm), the resulting file can be opened with an image viewer that is compatible with the format. It is also possible to use the Portable Bitmap Format (PBM) Image Viewer created by the author. A (very small) image will be generated representing the contents.

P1
# The outline of a square.
5 5
1 1 1 1 1
1 0 0 0 1
1 0 0 0 1
1 0 0 0 1
1 1 1 1 1

The lines are numbered to make it easier to explain the format. The numbers in the edge of the block are not part of the format; for convenience, they cannot be selected or copied in the browser.

In the format, the three initial lines define the header of the file, with metadata to help to interpret the contents. P1 is the code for the PBM format. The next line (# The outline of a square.) is a comment; it is optional. The third line is required: the first number (in the example, 5) is the number of columns in the image. The second number (in the example, 5) is the number of lines. The image is, thus, represented as a matrix.

The remaining lines define the pixels (picture elements) that compose the image. The value 0 means that the pixel must be white; the value 1 means that the pixel must be black. With 5 rows containing 5 columns each, the image has 25 black or white pixels. As it will be shown for the PPM format, colored images have multiple bytes per pixel, to define values for each color.
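
To make the structure concrete, the following Python sketch reads the text of a PBM image in the P1 variant and builds the corresponding matrix. It is a simplification, not a complete implementation of the specification: the hypothetical function parse_plain_pbm() assumes that all values are separated by whitespace, as in the example of the square.

```python
def parse_plain_pbm(contents):
    """Convert the text of a plain (P1) PBM image into a matrix of 0s and 1s."""
    tokens = []
    for line in contents.splitlines():
        # Everything after a # is a comment and can be ignored.
        line = line.split("#", 1)[0]
        tokens.extend(line.split())

    if len(tokens) < 3 or tokens[0] != "P1":
        raise ValueError("Not a plain PBM (P1) image.")

    # Header: number of columns first, then number of rows.
    columns = int(tokens[1])
    rows = int(tokens[2])
    values = [int(token) for token in tokens[3:]]
    if len(values) != rows * columns:
        raise ValueError("Unexpected number of pixels.")

    # Split the flat list of pixels into one list per row.
    return [values[row * columns:(row + 1) * columns] for row in range(rows)]


square = """P1
# The outline of a square.
5 5
1 1 1 1 1
1 0 0 0 1
1 0 0 0 1
1 0 0 0 1
1 1 1 1 1
"""

matrix = parse_plain_pbm(square)
for row in matrix:
    # Draw black pixels as # and white pixels as . in the terminal.
    print("".join("#" if pixel == 1 else "." for pixel in row))
```

Running the sketch draws the outline of the square in the terminal, which is a quick way of checking whether the values of a small image are correct.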

With suitable combinations of values for pixels, one can draw any image. For instance, images in the format Joint Photographic Experts Group (JPEG, with extensions .jpeg or .jpg) and Portable Network Graphics (PNG, with extension .png) also map pixels as numeric values in a matrix. This kind of image is called bitmap (or raster or matricial image). The higher the number of pixels, the higher will be the image resolution, as well as its size, potential quality, and detailing. High quality images often have millions of pixels. For instance, photographic cameras define the resolution in megapixels (MP), which means millions of pixels.

As the author is not an artist, this topic will explore another use for images. In particular, it is possible to create representations of alphabet letters, numbers and symbols to draw text into an image. When a program exhibits content in the screen of a computer, this is one of the required steps to convert data from strings to images to show. In other words, when text is shown in a monitor, it is processed by code that is similar to (though more sophisticated than) the one that will be implemented to build the image. For computer fonts, the process of conversion is called font rasterization, and high quality versions use vector images. However, instead of writing the created image in a file, the memory that stores the image is sent to the screen.

In this section, the examples use files for simplicity. For instance, the next snippet contains a text image in the PBM format to draw the string Franco Garcia. It will be used for the remaining examples.

P1
# Franco Garcia
23 11
1 1 1 0 1 1 0 0 0 1 0 0 1 1 0 0 0 1 1 0 0 1 0
1 0 0 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 0 0 1 0 1
1 1 1 0 1 1 0 0 1 1 1 0 1 0 1 0 1 0 0 0 1 0 1
1 0 0 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 0 0 1 0 1
1 0 0 0 1 0 1 0 1 0 1 0 1 0 1 0 0 1 1 0 0 1 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 1 1 0 0 1 0 0 1 1 0 0 0 1 1 0 1 1 1 0 0 1 0
1 0 0 0 1 0 1 0 1 0 1 0 1 0 0 0 0 1 0 0 1 0 1
1 0 1 0 1 1 1 0 1 1 0 0 1 0 0 0 0 1 0 0 1 1 1
1 0 1 0 1 0 1 0 1 0 1 0 1 0 0 0 0 1 0 0 1 0 1
0 1 1 0 1 0 1 0 1 0 1 0 0 1 1 0 1 1 1 0 1 0 1

The image has 23 columns (the first value of the third line) and 11 lines (second value of the third line).

JavaScript, Python, Lua and GDScript allow defining strings with multiple lines using special syntax. The implementation uses that feature to write the values of the textual image in a string. In languages that do not provide such a feature, one could add a \n for each line break.

let file_path = "franco-garcia-franco_name.pbm"
let contents = `P1
# Franco Garcia
23 11
1 1 1 0 1 1 0 0 0 1 0 0 1 1 0 0 0 1 1 0 0 1 0
1 0 0 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 0 0 1 0 1
1 1 1 0 1 1 0 0 1 1 1 0 1 0 1 0 1 0 0 0 1 0 1
1 0 0 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 0 0 1 0 1
1 0 0 0 1 0 1 0 1 0 1 0 1 0 1 0 0 1 1 0 0 1 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 1 1 0 0 1 0 0 1 1 0 0 0 1 1 0 1 1 1 0 0 1 0
1 0 0 0 1 0 1 0 1 0 1 0 1 0 0 0 0 1 0 0 1 0 1
1 0 1 0 1 1 1 0 1 1 0 0 1 0 0 0 0 1 0 0 1 1 1
1 0 1 0 1 0 1 0 1 0 1 0 1 0 0 0 0 1 0 0 1 0 1
0 1 1 0 1 0 1 0 1 0 1 0 0 1 1 0 1 1 1 0 1 0 1
`

let file = new File([contents], file_path, {type: "image/x-portable-bitmap"})
console.log("Image created successfully.")

let download_link = document.createElement("a")
download_link.target = "_blank"
download_link.href = URL.createObjectURL(file)
download_link.download = file.name
if (confirm("Do you want to download the file '" + file.name + "'?")) {
    download_link.click()
    // In this case, revokeObjectURL() can be used both for confirmation
    // and for cancelling.
    URL.revokeObjectURL(download_link.href)
}
import io
import sys

try:
    file_path = "franco-garcia-franco_name.pbm"
    file = open(file_path, "w")

    file.write("""P1
# Franco Garcia
23 11
1 1 1 0 1 1 0 0 0 1 0 0 1 1 0 0 0 1 1 0 0 1 0
1 0 0 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 0 0 1 0 1
1 1 1 0 1 1 0 0 1 1 1 0 1 0 1 0 1 0 0 0 1 0 1
1 0 0 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 0 0 1 0 1
1 0 0 0 1 0 1 0 1 0 1 0 1 0 1 0 0 1 1 0 0 1 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 1 1 0 0 1 0 0 1 1 0 0 0 1 1 0 1 1 1 0 0 1 0
1 0 0 0 1 0 1 0 1 0 1 0 1 0 0 0 0 1 0 0 1 0 1
1 0 1 0 1 1 1 0 1 1 0 0 1 0 0 0 0 1 0 0 1 1 1
1 0 1 0 1 0 1 0 1 0 1 0 1 0 0 0 0 1 0 0 1 0 1
0 1 1 0 1 0 1 0 1 0 1 0 0 1 1 0 1 1 1 0 1 0 1
""")
    file.close()

    print("Image created successfully.")
except OSError as exception:
    # In Python 3, IOError is an alias of OSError; one handler covers both.
    print("Error when trying to create the text file.", file=sys.stderr)
    print(exception)
local EXIT_FAILURE = 1

local file_path = "franco-garcia-franco_name.pbm"
local file = io.open(file_path, "w")
if (file == nil) then
    print(debug.traceback())

    error("Error when trying to create the text file.")
    os.exit(EXIT_FAILURE)
end

file:write([[
P1
# Franco Garcia
23 11
1 1 1 0 1 1 0 0 0 1 0 0 1 1 0 0 0 1 1 0 0 1 0
1 0 0 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 0 0 1 0 1
1 1 1 0 1 1 0 0 1 1 1 0 1 0 1 0 1 0 0 0 1 0 1
1 0 0 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 0 0 1 0 1
1 0 0 0 1 0 1 0 1 0 1 0 1 0 1 0 0 1 1 0 0 1 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 1 1 0 0 1 0 0 1 1 0 0 0 1 1 0 1 1 1 0 0 1 0
1 0 0 0 1 0 1 0 1 0 1 0 1 0 0 0 0 1 0 0 1 0 1
1 0 1 0 1 1 1 0 1 1 0 0 1 0 0 0 0 1 0 0 1 1 1
1 0 1 0 1 0 1 0 1 0 1 0 1 0 0 0 0 1 0 0 1 0 1
0 1 1 0 1 0 1 0 1 0 1 0 0 1 1 0 1 1 1 0 1 0 1
]])

io.close(file)

print("Image created successfully.")
extends Node

const EXIT_FAILURE = 1

func _ready():
    var file_path = "franco-garcia-franco_name.pbm"
    var file = File.new()
    if (file.open(file_path, File.WRITE) != OK):
        printerr("Error when trying to create the text file.")
        get_tree().quit(EXIT_FAILURE)

    file.store_string("""P1
# Franco Garcia
23 11
1 1 1 0 1 1 0 0 0 1 0 0 1 1 0 0 0 1 1 0 0 1 0
1 0 0 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 0 0 1 0 1
1 1 1 0 1 1 0 0 1 1 1 0 1 0 1 0 1 0 0 0 1 0 1
1 0 0 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 0 0 1 0 1
1 0 0 0 1 0 1 0 1 0 1 0 1 0 1 0 0 1 1 0 0 1 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 1 1 0 0 1 0 0 1 1 0 0 0 1 1 0 1 1 1 0 0 1 0
1 0 0 0 1 0 1 0 1 0 1 0 1 0 0 0 0 1 0 0 1 0 1
1 0 1 0 1 1 1 0 1 1 0 0 1 0 0 0 0 1 0 0 1 1 1
1 0 1 0 1 0 1 0 1 0 1 0 1 0 0 0 0 1 0 0 1 0 1
0 1 1 0 1 0 1 0 1 0 1 0 0 1 1 0 1 1 1 0 1 0 1
""")

    file.close()

    print("Image created successfully.")

The resulting image will be small (23x11 pixels, which results in 253 pixels in total). In the following illustration, the resulting image is on top, and it is compared to a 1600% scaled up image below it, both viewed in the software Gwenview. The images are in real size (which can potentially exceed the maximum width for visualization in a mobile device).

Resulting PBM image shown in Gwenview with the text 'Franco Garcia' written in black over white background. The image at the top results from running the program, with dimensions 23x11 pixels. The image below is a magnification of 1600%.

If you use the Portable Bitmap Format (PBM) Image Viewer created by the author, it will show both the original image as well as a scaled version of it.

An example of how to magnify it in the implementation will be provided at an appropriate time. Before that, however, it is worth providing patterns for other letters. After all, with the ability to draw in image files, you probably will not want to write the name of the author. That is, unless your name is Franco Garcia as well.

Patterns for the Alphabet and Selected Characters

For more representations of letters, one can draw inspiration from pixel art fonts. For instance, the font DatCub, created by GGBot, has been used as a model to write the previous phrase (the positions of the 1 values follow the font). The following examples provide images in the PBM format for all letters of the English alphabet, digits and some punctuation symbols. To access all tabs, you will probably need to view this page in a desktop browser, because I have not implemented scroll in the menu for code blocks.

Uppercase letters:

P1
# A
3 5
0 1 0
1 0 1
1 1 1
1 0 1
1 0 1
P1
# B
3 5
1 1 0
1 0 1
1 1 0
1 0 1
1 1 0
P1
# C
3 5
0 1 1
1 0 0
1 0 0
1 0 0
0 1 1
P1
# D
3 5
1 1 0
1 0 1
1 0 1
1 0 1
1 1 0
P1
# E
3 5
1 1 1
1 0 0
1 1 1
1 0 0
1 1 1
P1
# F
3 5
1 1 1
1 0 0
1 1 1
1 0 0
1 0 0
P1
# G
3 5
0 1 1
1 0 0
1 0 1
1 0 1
0 1 1
P1
# H
3 5
1 0 1
1 0 1
1 1 1
1 0 1
1 0 1
P1
# I
3 5
1 1 1
0 1 0
0 1 0
0 1 0
1 1 1
P1
# J
3 5
0 1 1
0 0 1
0 0 1
1 0 1
0 1 0
P1
# K
3 5
1 0 1
1 0 1
1 1 0
1 0 1
1 0 1
P1
# L
3 5
1 0 0
1 0 0
1 0 0
1 0 0
1 1 1
P1
# M
3 5
1 0 1
1 1 1
1 0 1
1 0 1
1 0 1
P1
# N
3 5
1 1 0
1 0 1
1 0 1
1 0 1
1 0 1
P1
# O
3 5
0 1 0
1 0 1
1 0 1
1 0 1
0 1 0
P1
# P
3 5
1 1 0
1 0 1
1 1 0
1 0 0
1 0 0
P1
# Q
3 5
0 1 0
1 0 1
1 0 1
1 1 1
0 1 1
P1
# R
3 5
1 1 0
1 0 1
1 1 0
1 0 1
1 0 1
P1
# S
3 5
0 1 1
1 0 0
1 1 1
0 0 1
1 1 0
P1
# T
3 5
1 1 1
0 1 0
0 1 0
0 1 0
0 1 0
P1
# U
3 5
1 0 1
1 0 1
1 0 1
1 0 1
1 1 1
P1
# V
3 5
1 0 1
1 0 1
1 0 1
0 1 0
0 1 0
P1
# W
3 5
1 0 1
1 0 1
1 0 1
1 1 1
1 0 1
P1
# X
3 5
1 0 1
1 0 1
0 1 0
1 0 1
1 0 1
P1
# Y
3 5
1 0 1
1 0 1
0 1 0
0 1 0
0 1 0
P1
# Z
3 5
1 1 1
0 0 1
0 1 0
1 0 0
1 1 1

Numbers and symbols:

P1
# 0
3 5
1 1 1
1 0 1
1 0 1
1 0 1
1 1 1
P1
# 1
3 5
0 1 0
1 1 0
0 1 0
0 1 0
1 1 1
P1
# 2
3 5
0 1 1
1 0 1
0 1 0
1 0 0
1 1 1
P1
# 3
3 5
1 1 0
0 0 1
1 1 1
0 0 1
1 1 0
P1
# 4
3 5
1 0 1
1 0 1
1 1 1
0 0 1
0 0 1
P1
# 5
3 5
1 1 1
1 0 0
1 1 1
0 0 1
1 1 0
P1
# 6
3 5
0 1 1
1 0 0
1 1 1
1 0 1
1 1 1
P1
# 7
3 5
1 1 1
0 0 1
0 1 0
0 1 0
0 1 0
P1
# 8
3 5
1 1 1
1 0 1
0 1 0
1 0 1
1 1 1
P1
# 9
3 5
1 1 1
1 0 1
0 1 1
0 0 1
1 1 0
P1
# .
3 5
0 0 0
0 0 0
0 0 0
0 0 0
1 0 0
P1
# ;
3 5
0 0 0
0 0 0
0 0 0
0 1 0
0 1 0
P1
# ,
3 5
0 0 0
0 0 0
0 0 0
0 1 0
0 1 0
P1
# :
3 5
0 0 0
0 1 0
0 0 0
0 1 0
0 0 0
P1
# '
3 5
0 0 1
0 0 1
0 0 0
0 0 0
0 0 0
P1
# "
3 5
0 1 1
0 1 1
0 0 0
0 0 0
0 0 0
P1
# (
3 5
0 1 0
1 0 0
1 0 0
1 0 0
0 1 0
P1
# !
3 5
0 1 0
0 1 0
0 1 0
0 0 0
0 1 0
P1
# ?
3 5
1 1 1
0 0 1
0 1 0
0 0 0
0 1 0
P1
# )
3 5
0 1 0
0 0 1
0 0 1
0 0 1
0 1 0
P1
# +
3 5
0 0 0
0 1 0
1 1 1
0 1 0
0 0 0
P1
# -
3 5
0 0 0
0 0 0
1 1 1
0 0 0
0 0 0
P1
# *
3 5
0 1 0
1 0 1
0 1 0
0 0 0
0 0 0
P1
# /
3 5
0 0 1
0 0 1
0 1 0
1 0 0
1 0 0
P1
# =
3 5
0 0 0
1 1 1
0 0 0
1 1 1
0 0 0

With the previous codes, one can draw alphanumeric text as images. With a bit of creativity and programming, one can map each individual representation with the respective character to convert a string into images to represent them.
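
As a hedged illustration of such a mapping, the following Python sketch stores a small subset of the patterns in a dictionary and joins them to build the contents of a PBM image for a string. The names GLYPHS and text_to_pbm() are hypothetical; the sketch assumes a single line of text, glyphs with 3 columns and 5 rows, and one blank column between characters (the same layout of the Franco Garcia image, in which each row of 6 letters uses 23 columns).

```python
# Hypothetical subset of the patterns; each string is one row of a character.
GLYPHS = {
    "H": ["101", "101", "111", "101", "101"],
    "I": ["111", "010", "010", "010", "111"],
    " ": ["000", "000", "000", "000", "000"],
}
GLYPH_ROWS = 5
GLYPH_COLUMNS = 3


def text_to_pbm(text):
    """Build the contents of a plain PBM image for a single line of text."""
    if not text:
        raise ValueError("The text must have at least one character.")

    lines = []
    for row in range(GLYPH_ROWS):
        # A blank column ("0") separates neighboring characters.
        pixels = "0".join(GLYPHS[character][row] for character in text)
        lines.append(" ".join(pixels))

    # Each character uses 3 columns, plus 1 separator between characters.
    columns = (GLYPH_COLUMNS + 1) * len(text) - 1
    header = "P1\n# " + text + "\n" + str(columns) + " " + str(GLYPH_ROWS) + "\n"

    return header + "\n".join(lines) + "\n"


print(text_to_pbm("HI"))
```

A character that is missing from the dictionary raises a KeyError in this sketch; a more complete version would include all the previous patterns or choose a default pattern for unknown characters.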

Portable Graymap Format (PGM)

The next step is introducing shades of gray to the image. The PGM format allows working with gray scale values; it is specified in the documentation. To define the metadata, one uses P2 and an additional line, which defines the maximum value of the gray scale. The value can have one or two bytes. Therefore, you can choose any value from 1 to 65535 (zero also counts as a shade, so a maximum of 255 provides 256 shades). Older versions of the format limited the scale to one byte (which means that the maximum value was 255).

In the format PGM, the values for colors are inverted (when compared to PBM). 0 is black; the maximum defined value is white. Intermediate values define the gray scale. The closer to 0, the darker. The closer to the maximum value, the lighter.

The following example uses 255 as the maximum value for the gray scale. In other words, 256 shades can be used. To keep the values aligned, it uses 000 for black and 255 for white.

P2
# Franco Garcia
23 11
255
000 000 000 255 000 000 255 255 255 000 255 255 000 000 255 255 255 000 000 255 255 000 255
000 255 255 255 000 255 000 255 000 255 000 255 000 255 000 255 000 255 255 255 000 255 000
000 000 000 255 000 000 255 255 000 000 000 255 000 255 000 255 000 255 255 255 000 255 000
000 255 255 255 000 255 000 255 000 255 000 255 000 255 000 255 000 255 255 255 000 255 000
000 255 255 255 000 255 000 255 000 255 000 255 000 255 000 255 255 000 000 255 255 000 255
255 255 255 255 255 255 255 255 255 255 255 255 255 255 255 255 255 255 255 255 255 255 255
255 000 000 255 255 000 255 255 000 000 255 255 255 000 000 255 000 000 000 255 255 000 255
000 255 255 255 000 255 000 255 000 255 000 255 000 255 255 255 255 000 255 255 000 255 000
000 255 000 255 000 000 000 255 000 000 255 255 000 255 255 255 255 000 255 255 000 000 000
000 255 000 255 000 255 000 255 000 255 000 255 000 255 255 255 255 000 255 255 000 255 000
255 000 000 255 000 255 000 255 000 255 000 255 255 000 000 255 000 000 000 255 000 255 000
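
The relation between the values of the PBM version and the values of the PGM version can be sketched in a few lines of Python. The function name pbm_row_to_pgm() is hypothetical; the sketch only inverts the values (1 becomes 0, and 0 becomes the maximum value) and pads the result with zeros, to reproduce the first row of the previous image.

```python
MAXIMUM = 255


def pbm_row_to_pgm(row, maximum=MAXIMUM):
    # PBM: 1 is black, 0 is white. PGM: 0 is black, maximum is white.
    return [(1 - pixel) * maximum for pixel in row]


# First row of pixels of the PBM image with the text Franco Garcia.
first_row = [1, 1, 1, 0, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 1, 0]

converted = pbm_row_to_pgm(first_row)
# zfill(3) pads the values with zeros on the left, keeping them aligned.
print(" ".join(str(value).zfill(3) for value in converted))
```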

Naturally, using only black and white when 254 other shades are available is a waste. To start creating an image programmatically, the image can be thought of as a matrix of integer values. This way, the program can draw or define some logic in code to choose a shade.

function random_integer(inclusive_minimum, inclusive_maximum) {
    let minimum = Math.ceil(inclusive_minimum)
    let maximum = Math.floor(inclusive_maximum)

    return Math.floor(minimum + Math.random() * (maximum + 1 - minimum))
}

// White
const W = 255
// Black
const B = 0

let file_path = "franco-garcia-franco_name.pgm"

let data = [
    [B, B, B, W, B, B, W, W, W, B, W, W, B, B, W, W, W, B, B, W, W, B, W],
    [B, W, W, W, B, W, B, W, B, W, B, W, B, W, B, W, B, W, W, W, B, W, B],
    [B, B, B, W, B, B, W, W, B, B, B, W, B, W, B, W, B, W, W, W, B, W, B],
    [B, W, W, W, B, W, B, W, B, W, B, W, B, W, B, W, B, W, W, W, B, W, B],
    [B, W, W, W, B, W, B, W, B, W, B, W, B, W, B, W, W, B, B, W, W, B, W],
    [W, W, W, W, W, W, W, W, W, W, W, W, W, W, W, W, W, W, W, W, W, W, W],
    [W, B, B, W, W, B, W, W, B, B, W, W, W, B, B, W, B, B, B, W, W, B, W],
    [B, W, W, W, B, W, B, W, B, W, B, W, B, W, W, W, W, B, W, W, B, W, B],
    [B, W, B, W, B, B, B, W, B, B, W, W, B, W, W, W, W, B, W, W, B, B, B],
    [B, W, B, W, B, W, B, W, B, W, B, W, B, W, W, W, W, B, W, W, B, W, B],
    [W, B, B, W, B, W, B, W, B, W, B, W, W, B, B, W, B, B, B, W, B, W, B]
]
let rows = data.length
let columns = data[0].length

for (let row = 0; row < rows; ++row) {
    for (let column = 0; column < columns; ++column) {
        if (data[row][column] === B) {
            data[row][column] = random_integer(B, (row * columns + column) % W)
        }
    }
}

let contents = "P2\n# Franco Garcia\n"
contents += columns + " " + rows + "\n"
contents += W + "\n"
for (let row = 0; row < rows; ++row) {
    for (let column = 0; column < columns; ++column) {
        contents += data[row][column] + " "
    }

    contents += "\n"
}

let file = new File([contents], file_path, {type: "image/x-portable-graymap"})
console.log("Image created successfully.")

let download_link = document.createElement("a")
download_link.target = "_blank"
download_link.href = URL.createObjectURL(file)
download_link.download = file.name
if (confirm("Do you want to download the file '" + file.name + "'?")) {
    download_link.click()
}
// Release the object URL whether the download was confirmed or cancelled.
URL.revokeObjectURL(download_link.href)
import io
import random
import sys

from typing import Final

# White
W: Final = 255
# Black
B: Final = 0

file_path = "franco-garcia-franco_name.pgm"

random.seed()

data = [
    [B, B, B, W, B, B, W, W, W, B, W, W, B, B, W, W, W, B, B, W, W, B, W],
    [B, W, W, W, B, W, B, W, B, W, B, W, B, W, B, W, B, W, W, W, B, W, B],
    [B, B, B, W, B, B, W, W, B, B, B, W, B, W, B, W, B, W, W, W, B, W, B],
    [B, W, W, W, B, W, B, W, B, W, B, W, B, W, B, W, B, W, W, W, B, W, B],
    [B, W, W, W, B, W, B, W, B, W, B, W, B, W, B, W, W, B, B, W, W, B, W],
    [W, W, W, W, W, W, W, W, W, W, W, W, W, W, W, W, W, W, W, W, W, W, W],
    [W, B, B, W, W, B, W, W, B, B, W, W, W, B, B, W, B, B, B, W, W, B, W],
    [B, W, W, W, B, W, B, W, B, W, B, W, B, W, W, W, W, B, W, W, B, W, B],
    [B, W, B, W, B, B, B, W, B, B, W, W, B, W, W, W, W, B, W, W, B, B, B],
    [B, W, B, W, B, W, B, W, B, W, B, W, B, W, W, W, W, B, W, W, B, W, B],
    [W, B, B, W, B, W, B, W, B, W, B, W, W, B, B, W, B, B, B, W, B, W, B]
]
rows = len(data)
columns = len(data[0])

for row in range(rows):
    for column in range(columns):
        if (data[row][column] == B):
            data[row][column] = random.randint(B, (row * columns + column) % W)

contents = "P2\n# Franco Garcia\n"
contents += str(columns) + " " + str(rows) + "\n"
contents += str(W) + "\n"
for row in range(rows):
    for column in range(columns):
        contents += str(data[row][column]) + " "

    contents += "\n"

try:
    file = open(file_path, "w")

    file.write(contents)
    file.close()

    print("Image created successfully.")
except OSError as exception:
    # In Python 3, IOError is an alias of OSError, so one handler covers both.
    print("Error when trying to create the image.", file=sys.stderr)
    print(exception, file=sys.stderr)
local EXIT_FAILURE = 1

-- White
local W = 255
-- Black
local B = 0

local file_path = "franco-garcia-franco_name.pgm"

math.randomseed(os.time())

local data = {
    {B, B, B, W, B, B, W, W, W, B, W, W, B, B, W, W, W, B, B, W, W, B, W},
    {B, W, W, W, B, W, B, W, B, W, B, W, B, W, B, W, B, W, W, W, B, W, B},
    {B, B, B, W, B, B, W, W, B, B, B, W, B, W, B, W, B, W, W, W, B, W, B},
    {B, W, W, W, B, W, B, W, B, W, B, W, B, W, B, W, B, W, W, W, B, W, B},
    {B, W, W, W, B, W, B, W, B, W, B, W, B, W, B, W, W, B, B, W, W, B, W},
    {W, W, W, W, W, W, W, W, W, W, W, W, W, W, W, W, W, W, W, W, W, W, W},
    {W, B, B, W, W, B, W, W, B, B, W, W, W, B, B, W, B, B, B, W, W, B, W},
    {B, W, W, W, B, W, B, W, B, W, B, W, B, W, W, W, W, B, W, W, B, W, B},
    {B, W, B, W, B, B, B, W, B, B, W, W, B, W, W, W, W, B, W, W, B, B, B},
    {B, W, B, W, B, W, B, W, B, W, B, W, B, W, W, W, W, B, W, W, B, W, B},
    {W, B, B, W, B, W, B, W, B, W, B, W, W, B, B, W, B, B, B, W, B, W, B}
}
local rows = #data
local columns = #data[1]

for row = 1, rows do
    for column = 1, columns do
        if (data[row][column] == B) then
            data[row][column] = math.random(B, (row * columns + column) % W)
        end
    end
end

local contents = "P2\n# Franco Garcia\n"
contents = contents .. columns .. " " .. rows .. "\n"
contents = contents .. W .. "\n"
for row = 1, rows do
    for column = 1, columns do
        contents = contents .. data[row][column] .. " "
    end

    contents = contents .. "\n"
end

local file = io.open(file_path, "w")
if (file == nil) then
    print(debug.traceback())

    io.stderr:write("Error when trying to create the image.\n")
    os.exit(EXIT_FAILURE)
end

file:write(contents)
io.close(file)

print("Image created successfully.")
extends Node

const EXIT_FAILURE = 1

# White
const W = 255
# Black
const B = 0

func random_integer(inclusive_minimum, inclusive_maximum):
    var minimum = ceil(inclusive_minimum)
    var maximum = floor(inclusive_maximum)

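    # randi(): random 32-bit unsigned integer in [0, 2^32 - 1].
    # ceil() and floor() return floats; convert to int to keep integer results.
    return randi() % int(maximum + 1 - minimum) + int(minimum)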

func _ready():
    var file_path = "franco-garcia-franco_name.pgm"

    randomize()

    var data = [
        [B, B, B, W, B, B, W, W, W, B, W, W, B, B, W, W, W, B, B, W, W, B, W],
        [B, W, W, W, B, W, B, W, B, W, B, W, B, W, B, W, B, W, W, W, B, W, B],
        [B, B, B, W, B, B, W, W, B, B, B, W, B, W, B, W, B, W, W, W, B, W, B],
        [B, W, W, W, B, W, B, W, B, W, B, W, B, W, B, W, B, W, W, W, B, W, B],
        [B, W, W, W, B, W, B, W, B, W, B, W, B, W, B, W, W, B, B, W, W, B, W],
        [W, W, W, W, W, W, W, W, W, W, W, W, W, W, W, W, W, W, W, W, W, W, W],
        [W, B, B, W, W, B, W, W, B, B, W, W, W, B, B, W, B, B, B, W, W, B, W],
        [B, W, W, W, B, W, B, W, B, W, B, W, B, W, W, W, W, B, W, W, B, W, B],
        [B, W, B, W, B, B, B, W, B, B, W, W, B, W, W, W, W, B, W, W, B, B, B],
        [B, W, B, W, B, W, B, W, B, W, B, W, B, W, W, W, W, B, W, W, B, W, B],
        [W, B, B, W, B, W, B, W, B, W, B, W, W, B, B, W, B, B, B, W, B, W, B]
    ]
    var rows = len(data)
    var columns = len(data[0])

    for row in range(rows):
        for column in range(columns):
            if (data[row][column] == B):
                data[row][column] = random_integer(B, (row * columns + column) % W)

    var contents = "P2\n# Franco Garcia\n"
    contents += str(columns) + " " + str(rows) + "\n"
    contents += str(W) + "\n"
    for row in range(rows):
        for column in range(columns):
            contents += str(data[row][column]) + " "

        contents += "\n"

    var file = File.new()
    if (file.open(file_path, File.WRITE)<