Table Representation
Suppose that we want to create a database server in Python. The program can represent a database table as a Python dictionary. For example, consider the following table:
Given Name | Family Name | Weight (in kg) |
---|---|---|
Mary | Smith | 60 |
Emma | Lin | 51 |
Lucas | Smalls | 82 |
We can represent this table with the following dictionary:
profiles = {
    "columns": ["given_name", "family_name", "weight"],
    "rows": [
        ["Mary", "Smith", 60],
        ["Emma", "Lin", 51],
        ["Lucas", "Smalls", 82],
    ]
}
The entry with key "columns" contains the column names, and the entry with key "rows" contains the table rows.
Let's see how we can query this table. First, we write a helper function that finds the index of a column:
def find_column(table, column):
    # Return the position of the named column, or -1 if it is not found.
    index = 0
    for name in table['columns']:
        if column == name:
            return index
        index += 1
    return -1
For example, find_column(profiles, 'weight') returns 2.
The following function queries the table:
def select(table, column, value):
    # Collect every row whose entry in the given column equals value.
    index = find_column(table, column)
    output = []
    for row in table['rows']:
        if row[index] == value:
            output.append(row)
    return output
The select function iterates over the table rows. For each row, if the element at the column index is equal to value, then the row is appended to the output. For example, select(profiles, "weight", 51) returns:
[ [ "Emma", "Lin", 51 ], ]
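One caveat: find_column returns -1 for an unknown column, and Python reads row[-1] as the last element of the row, so select would silently compare against the last column. The following is a minimal sketch of a guarded variant. The name select_checked is hypothetical and not part of the lesson.

```python
profiles = {
    "columns": ["given_name", "family_name", "weight"],
    "rows": [
        ["Mary", "Smith", 60],
        ["Emma", "Lin", 51],
        ["Lucas", "Smalls", 82],
    ],
}


def find_column(table, column):
    # Same helper as in the lesson: position of the column, or -1.
    index = 0
    for name in table["columns"]:
        if column == name:
            return index
        index += 1
    return -1


def select_checked(table, column, value):
    # Hypothetical variant of select: an unknown column matches no rows.
    index = find_column(table, column)
    if index == -1:
        return []
    return [row for row in table["rows"] if row[index] == value]


print(select_checked(profiles, "weight", 51))
print(select_checked(profiles, "height", 82))
```

Returning an empty list is only one possible choice; raising an error for an unknown column would be equally reasonable.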
Please write the Python expression that represents a table row with the following values:
Matt | Lions | 2370 Park Boulevard |
Please write the Python expression that represents the following table:
given_name | family_name | city |
---|---|---|
Curtis | Conway | Portland |
Billy | Homes | Cleveland |
Matt | Lions | Miami |
Serializing the Table
When the contents of the table change, the database server writes the updated table to the hard drive. The table is a Python dictionary, so the server needs to write the dictionary to the hard drive. But how do you write a dictionary?
In the operating systems course, we called the write_file function to write data to the hard drive. For example, the following call writes "The speed of sound is 343 m/s" to speed.txt:
write_file("speed.txt", "The speed of sound is 343 m/s")
The problem is that the second argument of write_file has to be a string. Thus, if we want to call the write_file function, we need to convert the dictionary into a string. (In a real operating system, data is written as a stream of 0s and 1s rather than as a string, but the same problem applies: we would still need to convert the dictionary into 0s and 1s.)
We can convert a dictionary to a string recursively. That is, we first convert the entries of the dictionary into strings and then convert the dictionary. If an entry is also a collection, then we first convert the elements of the collection before converting the entry, and so on.
Let's try converting the profiles table to a string. First, we convert the 2 entries in the dictionary. The first entry has key "columns" and value [ "given_name", "family_name", "weight" ]. Let's start with the key. "columns" is already a string, so we don't need to convert it. But we are going to surround the text columns with quotation marks (") to distinguish it from other types of data. Thus, the result of the conversion is simply:
"columns"
While columns is a string of 7 characters, the serialized version is a string of 9 characters. Also, notice that the serialized value is also a valid Python string expression.
Next, we serialize the entry value. Since the value is a list, we first serialize the list elements. Every element is a string, so we surround each one with quotation marks as before. Then we serialize the list by inserting commas between the serialized elements and surrounding them with the "[" and "]" characters. Here is the result:
["given_name","family_name","weight"]
The serialized list is a string of 37 characters. Note again that the serialized list is a valid Python list expression.
Now that the entry key and value have both been serialized, we can serialize the entry by inserting a colon (:) between the key and the value as follows:
"columns":["given_name","family_name","weight"]
This is a string of 47 characters.
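The character counts quoted so far can be checked with len. This is only a quick sanity check; the variable names key, value, and entry are chosen here for illustration.

```python
# Serialized pieces of the first dictionary entry, written as Python strings.
key = '"columns"'
value = '["given_name","family_name","weight"]'
entry = key + ":" + value

# Lengths of the raw text, the serialized key, the list, and the whole entry.
print(len("columns"), len(key), len(value), len(entry))
```

The entry length is the key length plus 1 for the colon plus the list length.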
Let's now serialize the second dictionary entry. The entry key is "rows". The entry value is a list of lists. First, we serialize the elements inside the list. The 3 elements are also lists. Thus, we first serialize the elements inside the first element. The elements are "Mary", "Smith", and 60. We serialize "Mary" and "Smith" as before. The last element is 60, which is an integer. We serialize integers by converting each digit to an ASCII character. For example, 60 becomes a string of 2 characters "6" (ASCII value of 54) and "0" (ASCII value of 48). But this time, we do not surround the characters with quotation marks.
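The integer step can be sketched in a few lines. The helper name serialize_int is hypothetical. It relies on Python's str to produce the decimal digits, and ord to show the ASCII value of each digit.

```python
def serialize_int(number):
    # str() yields the decimal digits; no quotation marks are added.
    return str(number)


digits = serialize_int(60)
codes = [ord(ch) for ch in digits]

print(digits)  # the 2-character string 60
print(codes)   # ASCII values of "6" and "0"
```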
Now we can serialize the inner list as before. Here is the result:
["Mary","Smith",60]
This is a string of 19 characters.
We then serialize the next two elements as follows:
["Emma","Lin",51]
["Lucas","Smalls",82]
We can then serialize the parent list as follows:
[["Mary","Smith",60],["Emma","Lin",51],["Lucas","Smalls",82]]
Next, we serialize the parent entry as follows:
"rows":[["Mary","Smith",60],["Emma","Lin",51],["Lucas","Smalls",82]]
Finally, we can serialize the parent dictionary. To serialize a dictionary, we insert commas between the serialized entries and surround the entries with the "{" and "}" characters. Here is the result:
{"columns":["given_name","family_name","weight"],"rows":[["Mary","Smith",60],["Emma","Lin",51],["Lucas","Smalls",82]]}
We have successfully serialized the Python dictionary into JavaScript Object Notation (JSON). Conveniently, this JSON formatted string is also a valid Python expression.
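The recursive procedure walked through above can be sketched as a short function. This is a minimal illustration, not a full JSON encoder: the name serialize is hypothetical, it handles only strings, integers, lists, and dictionaries, and it does not escape special characters inside strings. The last line compares the result with the standard library's json.dumps, using compact separators so that no spaces are inserted.

```python
import json


def serialize(value):
    # Hypothetical helper covering only the types used in the lesson.
    if isinstance(value, str):
        # Surround the text with quotation marks (no escaping is done).
        return '"' + value + '"'
    if isinstance(value, int) and not isinstance(value, bool):
        # Integers become their decimal digits, without quotation marks.
        return str(value)
    if isinstance(value, list):
        # Serialize the elements first, then join them with commas.
        return "[" + ",".join(serialize(item) for item in value) + "]"
    if isinstance(value, dict):
        # Each entry is the serialized key, a colon, and the serialized value.
        entries = [serialize(k) + ":" + serialize(v) for k, v in value.items()]
        return "{" + ",".join(entries) + "}"
    raise TypeError("unsupported type: " + type(value).__name__)


profiles = {
    "columns": ["given_name", "family_name", "weight"],
    "rows": [
        ["Mary", "Smith", 60],
        ["Emma", "Lin", 51],
        ["Lucas", "Smalls", 82],
    ],
}

text = serialize(profiles)
print(text)
print(text == json.dumps(profiles, separators=(",", ":")))
```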
The database server also needs to be able to read the dictionary back from the hard drive. For example, when we start the database server process, the process reads in the database tables from the file system so that the process can serve client queries.
To rebuild the dictionary, the server runs a parser over the JSON formatted string. After reading the string from the file system, the parser creates the following variables:
- stack (an empty stack)
- token (an empty stack)
- in_string (false)
The parser then iterates over the text, 1 character at a time. The following table describes how the server processes each character:
Character | Action |
---|---|
{ | Push the character to the stack. |
[ | Push the character to the stack. |
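Alphanumeric character (or, while in_string is true, any character other than ") | Push the character to token. |
" | If in_string is false, then set in_string to true. If in_string is true, then the parser just finished processing a string, so push the processed token to the stack, reset token to an empty stack, and set in_string to false. |
, | A token just ended. If the length of the token is greater than 0, then convert the token to a number, push it to the stack, and reset token to an empty stack. |
: | A dictionary entry key just ended. If the length of the token is greater than 0, then convert the token to a number, push it to the stack, and reset token to an empty stack. |
] | A list just ended. If the length of the token is greater than 0, first convert the token to a number and push it to the stack, as for a comma. Then start popping the stack until the popped element is '['. Construct a list with the popped elements (in reverse order, since the last element is popped first) and push the list to the stack. |
} | A dictionary just ended. If the length of the token is greater than 0, first convert the token to a number and push it to the stack, as for a comma. Then start popping the stack, 2 elements at a time, until the popped element is '{'. Each pair of elements is the entry value and key. Construct a dictionary with the popped entries and push the dictionary to the stack. |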
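The table above can be sketched in Python as follows. This is a minimal illustration under several assumptions, not the server's actual parser: the name parse is hypothetical, numbers are limited to non-negative integers, and string escapes are not handled. The sentinels LIST_START and DICT_START stand in for the pushed '[' and '{' characters so that a string value such as "[" cannot be mistaken for a marker. A pending number is also pushed when a list or dictionary closes, and every character inside a string goes to token.

```python
LIST_START = object()  # stands in for a pushed '['
DICT_START = object()  # stands in for a pushed '{'


def parse(text):
    stack = []
    token = []  # characters of the token currently being read
    in_string = False

    def flush_number():
        # Push a pending number token, if any, and reset token.
        if token:
            stack.append(int("".join(token)))
            token.clear()

    for ch in text:
        if in_string:
            if ch == '"':
                # A string just ended: push it and reset the state.
                stack.append("".join(token))
                token.clear()
                in_string = False
            else:
                token.append(ch)
        elif ch == '"':
            in_string = True
        elif ch == "[":
            stack.append(LIST_START)
        elif ch == "{":
            stack.append(DICT_START)
        elif ch.isalnum():
            token.append(ch)
        elif ch in ",:":
            flush_number()
        elif ch == "]":
            flush_number()
            items = []
            while True:
                item = stack.pop()
                if item is LIST_START:
                    break
                items.append(item)
            items.reverse()  # elements were popped last-first
            stack.append(items)
        elif ch == "}":
            flush_number()
            pairs = []
            while True:
                value = stack.pop()
                if value is DICT_START:
                    break
                key = stack.pop()
                pairs.append((key, value))
            stack.append(dict(reversed(pairs)))
    return stack.pop()


text = '{"columns":["given_name","family_name","weight"],"rows":[["Mary","Smith",60],["Emma","Lin",51],["Lucas","Smalls",82]]}'
table = parse(text)
print(table["columns"])
print(table["rows"][1])
```

Any other character, such as whitespace, is ignored by this sketch.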
(Animation: an example run of the JSON parsing algorithm.)
In Python, we can serialize a Python object using the json module from the standard library. Logicwalk Python does not support modules, but it provides a built-in function called json_dumps that does the same job. For example, the following code serializes the profiles table and writes the output to a file:
serialized = json_dumps(profiles)
write_file('profiles.json', serialized)
The Python equivalent function is json.dumps.
To deserialize, or decode, the JSON formatted text into a Python object, use the json_loads function as follows:
serialized = read_file('profiles.json')
profiles = json_loads(serialized)
The Python equivalent function is json.loads.
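For comparison, here is a sketch of the same round trip in standard Python. It assumes that the built-in open function stands in for the lesson's write_file and read_file, and it writes to a temporary directory so that the example leaves no file behind in the working directory.

```python
import json
import os
import tempfile

profiles = {
    "columns": ["given_name", "family_name", "weight"],
    "rows": [
        ["Mary", "Smith", 60],
        ["Emma", "Lin", 51],
        ["Lucas", "Smalls", 82],
    ],
}

# Temporary location for the example file.
path = os.path.join(tempfile.mkdtemp(), "profiles.json")

# Serialize the table and write the text to the file.
with open(path, "w") as f:
    f.write(json.dumps(profiles))

# Read the text back and deserialize it into a dictionary.
with open(path) as f:
    restored = json.loads(f.read())

print(restored == profiles)
```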
Please choose the true statements
Please write the LW Python expression that serializes a variable called lesson into a JSON formatted string and stores the result in lesson_text.
Please write the LW Python expression that deserializes a variable called question_text into a dictionary and stores the result in question.