Replication Internals: Decoding the MySQL Binary Log Part 7: TABLE_MAP_EVENT – Table Metadata for Row-Based Replication
In this seventh post of our series, we decode the TABLE_MAP_EVENT — the event that maps a numeric table ID to a database/table name and the column layout that the row events immediately following it will reference.
Introduction
The TABLE_MAP_EVENT (event type 19, 0x13) is essential for row-based replication. It appears before any row event (INSERT, UPDATE, DELETE) and provides:
- Table identification: Database name, table name, and a numeric table ID
- Column metadata: Number of columns and their types
- Nullability: Which columns can contain NULL values
Row events reference the table by its numeric ID, so the TABLE_MAP_EVENT must be parsed first to understand the row data that follows.
Event Location
In our binary log, TABLE_MAP_EVENTs appear before each set of row events:
Position 537: QUERY_EVENT - BEGIN
Position 620: TABLE_MAP_EVENT (68 bytes) ← Before INSERT
Position 688: WRITE_ROWS_EVENT
Position 737: XID_EVENT
Position 847: QUERY_EVENT - BEGIN
Position 939: TABLE_MAP_EVENT (68 bytes) ← Before UPDATE
Position 1007: UPDATE_ROWS_EVENT
Position 1079: XID_EVENT
Position 1189: QUERY_EVENT - BEGIN
Position 1272: TABLE_MAP_EVENT (68 bytes) ← Before DELETE
Position 1340: DELETE_ROWS_EVENT
Position 1397: XID_EVENT
Let's decode the first one at position 620.
Reading the Raw Bytes
$ xxd -s 620 -l 68 binlog.000024
0000026c: 3210 3568 1301 0000 0044 0000 00b0 0200 2.5h.....D......
0000027c: 0000 005f 0000 0000 0001 000c 7072 6573 ..._........pres
0000028c: 656e 7461 7469 6f6e 0006 7065 7273 6f6e entation..person
0000029c: 0002 030f 0258 0202 0101 0002 03fc ff00 .....X..........
000002ac: fba8 d0d8 ....
The event is 68 bytes: 19-byte header + 8-byte post-header + 37-byte payload + 4-byte checksum.
Common Header (19 bytes)
From Part 2, we know that every binary log event starts with the same 19-byte common header. Here it is for our event at position 620:
32103568 13 01000000 44000000 b0020000 0000
│ │ │ │ │ │
│ │ │ │ │ └─→ Flags: 0x0000
│ │ │ │ └───────────→ Next Position: 688
│ │ │ └────────────────────→ Event Size: 68 bytes
│ │ └─────────────────────────────→ Server ID: 1
│ └────────────────────────────────→ Event Type: 19 (TABLE_MAP_EVENT)
└─────────────────────────────────────────→ Timestamp: 1748308018
Quick cross-check: 620 + 68 = 688, which matches the Next Position field. ✓
TABLE_MAP_EVENT Structure
The layout is defined by Table_map_event in libs/mysql/binlog/event/rows_event.h and read by its constructor in libs/mysql/binlog/event/rows_event.cpp. The post-header is fixed-size (TABLE_MAP_HEADER_LEN = 8, with TM_MAPID_OFFSET = 0 and TM_FLAGS_OFFSET = 6); the body that follows is variable.
| Field | Size | Description |
|---|---|---|
| Post-Header (8 bytes) | ||
m_table_id | 6 bytes | Numeric identifier for the table (little-endian, 48-bit) |
m_flags | 2 bytes | Bit flags — see TM_*_F constants in sql/log_event.h |
| Body | ||
m_dblen | Packed int | Length of database name |
m_dbnam | m_dblen + 1 | Database name + trailing \0 |
m_tbllen | Packed int | Length of table name |
m_tblnam | m_tbllen + 1 | Table name + trailing \0 |
m_colcnt | Packed int | Number of columns |
m_coltype | m_colcnt bytes | One MySQL type code per column |
m_field_metadata_size | Packed int | Length of the field metadata block |
m_field_metadata | Variable (≤ 4 bytes per column) | Per-column type-specific metadata |
m_null_bits | (m_colcnt + 7) / 8 bytes | LSB-first bit per column; 1 = nullable |
m_optional_metadata | Variable | TLV-encoded extended column information |
Field-by-Field Decoding
Post-Header (8 bytes)
5f0000000000 0100
│ │
│ └─→ m_flags: 0x0001
└──────────────→ m_table_id: 0x000000000005f = 95
The post-header is exactly 6 bytes of m_table_id followed by 2 bytes of m_flags — no padding. The constructor in rows_event.cpp reads the table id with reader.read<uint64_t>(6) (note the explicit length of 6), then reads m_flags with reader.read<uint16_t>().
Table ID
5f 00 00 00 00 00 → 0x000000000005f = 95
The numeric handle for the table. This same 95 will appear in the post-header of every row event (WRITE_ROWS_EVENT, UPDATE_ROWS_EVENT, DELETE_ROWS_EVENT) that follows in this transaction, telling the applier which TABLE_MAP_EVENT it should use to interpret the row payload.
Flags
01 00 → 0x0001 = TM_BIT_LEN_EXACT_F
m_flags is a 16-bit field. The values are defined on Table_map_log_event in sql/log_event.h:2471-2481:
| Mask | Constant | Meaning |
|---|---|---|
0x0001 | TM_BIT_LEN_EXACT_F | Always set when MySQL writes the event |
0x0002 | TM_REFERRED_FK_DB_F | Table is referenced by a foreign-key constraint in another schema |
0x0004 | TM_GENERATED_INVISIBLE_PK_F | Table has a server-generated invisible primary key |
Our event has just TM_BIT_LEN_EXACT_F set, which is the normal case. ✓
Database Name
0c 70726573656e746174696f6e 00
│ │ │
│ │ └─→ Null terminator
│ └─→ "presentation" (12 bytes)
└─→ Length: 12
Database: "presentation"
Table Name
06 706572736f6e 00
│ │ │
│ │ └─→ Null terminator
│ └─→ "person" (6 bytes)
└─→ Length: 6
Table: "person"
Column Count
02
This is a net_field_length_ll packed integer (the same length encoding introduced in Part 1). Since 0x02 < 251, the value is a single byte: 2 columns.
Column Types (2 bytes, one per column)
03 0f
│ │
│ └─→ Column 2: 0x0f = MYSQL_TYPE_VARCHAR (15)
└────→ Column 1: 0x03 = MYSQL_TYPE_LONG (3) = INT
Our table has:
- Column 1:
ID INT→ Type 3 (MYSQL_TYPE_LONG) - Column 2:
name VARCHAR(150)→ Type 15 (MYSQL_TYPE_VARCHAR)
Metadata Length
02
The metadata block is 2 bytes long.
Column Metadata
58 02
Each column writes 0–4 bytes of metadata, depending on its type. The encoding lives in each Field subclass's do_save_field_metadata() in sql/field.cc. For our two columns:
MYSQL_TYPE_LONG(INT, 0x03) — 0 bytes. Plain integers carry no metadata.MYSQL_TYPE_VARCHAR(0x0f) — 2 bytes.Field_varstring::do_save_field_metadata()simply doesint2store(metadata_ptr, field_length), wherefield_lengthis the column's max length in bytes (not characters).
Reading 58 02 as little-endian: 0x0258 = 600. For name VARCHAR(150) declared in a utf8mb4 schema, MySQL reserves 4 bytes per character, so field_length = 150 × 4 = 600. The same column declared as VARCHAR(150) CHARACTER SET latin1 would store 150 here instead — the metadata is always max bytes, never max characters.
Null Bitmap
02
The null bitmap tells us which columns of the table are allowed to hold NULL. It uses one bit per column, packed into as few bytes as possible.
Since a byte holds 8 bits, a single byte can describe up to 8 columns. The moment you have a 9th column you need a second byte (even though only 1 bit of it is actually used), the 17th column triggers a third byte, and so on. In other words, you take the number of columns, divide by 8, and round up to the next whole byte:
bytes needed = ceil(number_of_columns / 8)
Our table has 2 columns, so (2 + 7) / 8 = 1 byte.
Bits inside each byte are read starting from the least significant bit (the rightmost one when you write the byte in binary): the first column sits at bit 0, the second column at bit 1, the third at bit 2, and so on. A bit set to 1 means that column is nullable; a bit set to 0 means it was declared NOT NULL.
Our byte is 0x02, which is 0000 0010 in binary:
02 = 0000 0010
││
│└─→ Column 1 (ID): bit 0 = 0 → NOT NULL
└──→ Column 2 (name): bit 1 = 1 → NULLABLE
That matches the DDL exactly: ID is the PRIMARY KEY (so implicitly NOT NULL), and name was declared DEFAULT NULL.
One important distinction worth calling out: this bitmap only records whether each column can hold NULL — it describes the schema, not the data. Whether a given row actually contains a NULL value for a particular column is answered by a second, per-row null bitmap that lives inside each row of a WRITE_ROWS_EVENT / UPDATE_ROWS_EVENT / DELETE_ROWS_EVENT. We'll meet that one in Part 8.
If you want to see the exact bit-extraction pattern used on the read side, it's in sql/log_event.cc:11735.
Optional Metadata
01 01 00 02 03 fc ff 00
After m_null_bits, the rest of the body (everything up to the CRC32) is the optional metadata block. Each entry is a Type-Length-Value triple:
[type: 1 byte][length: packed int][value: <length> bytes]
The full set of types is the Optional_metadata_field_type enum in rows_event.h:548-573:
| Code | Type | Purpose |
|---|---|---|
| 1 | SIGNEDNESS | Sign bit for each numeric column |
| 2 | DEFAULT_CHARSET | Default charset + per-column overrides |
| 3 | COLUMN_CHARSET | One charset per string column |
| 4 | COLUMN_NAME | Column names |
| 5 | SET_STR_VALUE | String values for SET columns |
| 6 | ENUM_STR_VALUE | String values for ENUM columns |
| 7 | GEOMETRY_TYPE | Concrete geometry subtype |
| 8 | SIMPLE_PRIMARY_KEY | PK column indices |
| 9 | PRIMARY_KEY_WITH_PREFIX | PK columns with prefix lengths |
| 10 | ENUM_AND_SET_DEFAULT_CHARSET | Default charset for ENUM/SET |
| 11 | ENUM_AND_SET_COLUMN_CHARSET | Per-column charset for ENUM/SET |
| 12 | COLUMN_VISIBILITY | Visible/invisible flag per column |
| 13 | VECTOR_DIMENSIONALITY | Dimensionality of VECTOR columns |
Which entries appear depends on the table definition and on binlog_row_metadata (MINIMAL vs. FULL). Our 8 bytes contain two entries:
Entry 1 — SIGNEDNESS
01 01 00
│ │ │
│ │ └─→ value: 0x00
│ └─────→ length: 1 byte
└────────→ type: 0x01 = SIGNEDNESS
This field uses one bit per numeric column in the table (string, blob, and date/time columns are skipped). A bit set to 1 means the column was declared UNSIGNED; a bit set to 0 means it's signed.
There's an important subtlety here. When we decoded the null bitmap earlier in this post, we read the bits right-to-left — the first column lived at the rightmost bit of the byte. SIGNEDNESS does the opposite: it reads left-to-right, starting from the leftmost bit of the first byte. So the first numeric column is the leftmost bit, the second numeric column is the next bit to its right, and so on. When we run out of bits in one byte, we move on to the leftmost bit of the next byte. You can see this walk in parse_signedness().
Our table has exactly one numeric column (ID INT), and the value byte is 0x00:
0x00 = 0000 0000
│
└─→ 1st numeric column (ID): 0 → signed
The leftmost bit — the one belonging to ID — is 0, so ID is signed. That matches the DDL: ID INT was declared without the UNSIGNED keyword.
Entry 2 — DEFAULT_CHARSET
02 03 fc ff 00
│ │ │
│ │ └─→ value: fc ff 00
│ └─────→ length: 3 bytes
└────────→ type: 0x02 = DEFAULT_CHARSET
parse_default_charset() in rows_event.cpp:142-158 reads a packed integer for the default charset id, then walks (col_index, charset) pairs until it consumes length bytes.
- First packed int:
fc ff 00. The leading0xfcis the 2-byte marker fornet_field_length_ll, so the next two bytesff 00are the value:0x00ff= 255 =utf8mb4_0900_ai_ci. - That fully consumes the 3-byte value, so there are no per-column overrides — every string column uses the default.
Together, the 8 bytes tell the applier: "the only numeric column is signed, and every string column is utf8mb4_0900_ai_ci."
Visual Breakdown
Position 620: TABLE_MAP_EVENT (68 bytes)
┌─────────────────────────────────────────────────────────────────────────┐
│ COMMON HEADER (19 bytes) │
├─────────────────────────────────────────────────────────────────────────┤
│ 32103568 │ 13 │ 01000000 │ 44000000 │ b0020000 │ 0000 │
│ Timestamp │ Type │ ServerID │ Size │ NextPos │ Flags │
│ 1748308018 │ 19 │ 1 │ 68 │ 688 │ 0x0000 │
└─────────────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────────────┐
│ POST-HEADER (8 bytes) │
├──────────────────────────────┬──────────────────────────────────────────┤
│ 5f0000000000 │ m_table_id: 95 (6 bytes, little-endian) │
├──────────────────────────────┼──────────────────────────────────────────┤
│ 0100 │ m_flags: 0x0001 (TM_BIT_LEN_EXACT_F) │
└──────────────────────────────┴──────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────────────┐
│ BODY │
├──────────────────────────────┬──────────────────────────────────────────┤
│ 0c │ m_dblen: 12 │
├──────────────────────────────┼──────────────────────────────────────────┤
│ 70726573656e746174696f6e00 │ m_dbnam: "presentation\0" │
├──────────────────────────────┼──────────────────────────────────────────┤
│ 06 │ m_tbllen: 6 │
├──────────────────────────────┼──────────────────────────────────────────┤
│ 706572736f6e00 │ m_tblnam: "person\0" │
├──────────────────────────────┼──────────────────────────────────────────┤
│ 02 │ m_colcnt: 2 │
├──────────────────────────────┼──────────────────────────────────────────┤
│ 03 0f │ m_coltype: LONG (INT), VARCHAR │
├──────────────────────────────┼──────────────────────────────────────────┤
│ 02 │ m_field_metadata_size: 2 │
├──────────────────────────────┼──────────────────────────────────────────┤
│ 5802 │ m_field_metadata: VARCHAR max = 600 │
├──────────────────────────────┼──────────────────────────────────────────┤
│ 02 │ m_null_bits: col 2 nullable │
├──────────────────────────────┼──────────────────────────────────────────┤
│ 010100 │ SIGNEDNESS: ID is signed │
├──────────────────────────────┼──────────────────────────────────────────┤
│ 0203fcff00 │ DEFAULT_CHARSET: utf8mb4_0900_ai_ci (255)│
└──────────────────────────────┴──────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────────────┐
│ CHECKSUM (4 bytes) │
├─────────────────────────────────────────────────────────────────────────┤
│ fba8d0d8 │ CRC32 │
└──────────────────────────────┴──────────────────────────────────────────┘
Table: presentation.person (ID=95)
Columns: ID INT NOT NULL, name VARCHAR(150) NULL
Table ID Lifecycle
The Table ID is assigned by the server and is valid only within a single binary log file or replication session. Key points:
- Transient: The same table may have different IDs in different binary logs
- Referenced by row events: WRITE/UPDATE/DELETE_ROWS_EVENT use this ID
- Reused after FLUSH TABLES: Table IDs may be reassigned
When parsing row events, you must first parse the preceding TABLE_MAP_EVENT to know how to interpret the row data.
Try It Yourself
First, dump the bytes with xxd:
$ xxd -s 620 -l 68 binlog.000024
0000026c: 3210 3568 1301 0000 0044 0000 00b0 0200 2.5h.....D......
0000027c: 0000 005f 0000 0000 0001 000c 7072 6573 ..._........pres
0000028c: 656e 7461 7469 6f6e 0006 7065 7273 6f6e entation..person
0000029c: 0002 030f 0258 0202 0101 0002 03fc ff00 .....X..........
000002ac: fba8 d0d8 ....
Then decode it programmatically. The script below walks the post-header, body, and TLV optional-metadata block end-to-end:
import struct
# net_field_length_ll: MySQL's length-encoded integer
def read_packed_int(data, offset):
first = data[offset]
if first < 251:
return first, 1
elif first == 252:
return struct.unpack('<H', data[offset+1:offset+3])[0], 3
elif first == 253:
return struct.unpack('<I', data[offset+1:offset+4] + b'\x00')[0], 4
elif first == 254:
return struct.unpack('<Q', data[offset+1:offset+9])[0], 9
OPT_TYPES = {
1: 'SIGNEDNESS', 2: 'DEFAULT_CHARSET', 3: 'COLUMN_CHARSET',
4: 'COLUMN_NAME', 5: 'SET_STR_VALUE', 6: 'ENUM_STR_VALUE',
7: 'GEOMETRY_TYPE', 8: 'SIMPLE_PRIMARY_KEY', 9: 'PRIMARY_KEY_WITH_PREFIX',
10: 'ENUM_AND_SET_DEFAULT_CHARSET', 11: 'ENUM_AND_SET_COLUMN_CHARSET',
12: 'COLUMN_VISIBILITY', 13: 'VECTOR_DIMENSIONALITY',
}
TYPE_NAMES = {3: 'INT', 15: 'VARCHAR'}
with open('binlog.000024', 'rb') as f:
f.seek(620)
# Common header (19 bytes)
header = f.read(19)
timestamp, event_type, server_id, event_size, next_pos, flags = \
struct.unpack('<IBIIIH', header)
# Post-header: 6-byte table_id + 2-byte flags
post_header = f.read(8)
table_id = struct.unpack('<Q', post_header[:6] + b'\x00\x00')[0]
tm_flags = struct.unpack('<H', post_header[6:8])[0]
print(f"Table ID: {table_id}")
print(f"Flags: {tm_flags:#06x}")
# Body (everything between post-header and the 4-byte CRC)
body = f.read(event_size - 19 - 8 - 4)
o = 0
db_len = body[o]; o += 1
db_name = body[o:o+db_len].decode(); o += db_len + 1 # +1 for trailing \0
tbl_len = body[o]; o += 1
tbl_name = body[o:o+tbl_len].decode(); o += tbl_len + 1
colcnt, n = read_packed_int(body, o); o += n
coltype = list(body[o:o+colcnt]); o += colcnt
meta_len, n = read_packed_int(body, o); o += n
field_metadata = body[o:o+meta_len]; o += meta_len
null_bytes = (colcnt + 7) // 8
null_bits = body[o:o+null_bytes]; o += null_bytes
optional_metadata = body[o:]
print(f"\nDatabase: {db_name}")
print(f"Table: {tbl_name}")
print(f"Columns: {colcnt}")
# Walk per-column type + metadata + nullability
mi = 0
for i, t in enumerate(coltype):
nullable = bool(null_bits[i // 8] & (1 << (i % 8))) # LSB-first
if t == 15: # MYSQL_TYPE_VARCHAR
max_bytes = struct.unpack('<H', field_metadata[mi:mi+2])[0]
extra = f", max_bytes={max_bytes}"
mi += 2
else:
extra = ""
print(f" Column {i+1}: type {t} ({TYPE_NAMES.get(t, '?')})"
f"{extra}, nullable={nullable}")
# TLV optional metadata
print("\nOptional metadata:")
o = 0
while o < len(optional_metadata):
ftype = optional_metadata[o]; o += 1
flen, n = read_packed_int(optional_metadata, o); o += n
fval = optional_metadata[o:o+flen]; o += flen
name = OPT_TYPES.get(ftype, f'UNKNOWN({ftype})')
if ftype == 1: # SIGNEDNESS — MSB-first bit per numeric column
bits = []
for byte in fval:
for mask in (0x80, 0x40, 0x20, 0x10, 0x08, 0x04, 0x02, 0x01):
bits.append('unsigned' if byte & mask else 'signed')
print(f" {name}: {bits[:1]}") # one numeric column in this table
elif ftype == 2: # DEFAULT_CHARSET
default_cs, n2 = read_packed_int(fval, 0)
print(f" {name}: default_charset={default_cs}, overrides={fval[n2:].hex()}")
else:
print(f" {name}: {fval.hex()}")
Output:
Table ID: 95
Flags: 0x0001
Database: presentation
Table: person
Columns: 2
Column 1: type 3 (INT), nullable=False
Column 2: type 15 (VARCHAR), max_bytes=600, nullable=True
Optional metadata:
SIGNEDNESS: ['signed']
DEFAULT_CHARSET: default_charset=255, overrides=
The full source for this series, including a more complete decoder, lives in the presentations repo.
References
Table_map_event— class definition,TM_MAPID_OFFSET/TM_FLAGS_OFFSETenum, and theOptional_metadata_field_typeenumTable_map_log_eventTM_*_Fflags —TM_BIT_LEN_EXACT_F,TM_REFERRED_FK_DB_F,TM_GENERATED_INVISIBLE_PK_F(defined insql/log_event.h, not in the binlog-events library)TABLE_MAP_HEADER_LEN— post-header length constant (= 8)Table_map_event::Table_map_event(const char*, ...)— the read pathTable_map_event::Optional_metadata_fields— parser for the TLV optional metadata blockparse_signedness()andparse_default_charset()Table_map_log_event::write_data_header()andwrite_data_body()— the write pathTable_map_log_event::init_metadata_fields()— how the optional metadata block is populatedField_varstring::do_save_field_metadata()— 2-byte field metadata forVARCHARTable_id— 6-byte (48-bit) table identifier- MySQL docs:
binlog_row_metadata— controls which optional metadata fields get written
What's Next?
Now that we understand how table metadata is encoded, we're ready to decode actual row data. In the next post, we'll examine the WRITE_ROWS_EVENT — the event that records INSERT operations.
Next up: Part 8: WRITE_ROWS_EVENT — INSERT Operations
This series is based on a presentation given at the MySQL Online Summit. The goal is to help MySQL users understand what goes under the hood of replication by manually decoding binary log files.