What is an AWS Glue 'merge key'?
An AWS Glue “merge key” is the field or set of fields used to match records between two datasets when Glue performs a merge. It is an informal name: in the DynamicFrame API these are the primary key fields passed as primary_keys to mergeDynamicFrame, and they tell Glue which incoming row corresponds to which existing row.
How it works
When Glue merges a staging dataset into a target dataset, it compares the merge key values to find matches. If a staging row matches an existing row, the staging row replaces it; if no match is found, the staging row is added as a new record. Existing rows with no match in staging are kept unchanged.
Example
If customer_id is the merge key, then a new row with customer_id = 42 will
be matched to the existing row with customer_id = 42. That lets Glue treat
the new row as an update instead of a separate record.
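The matching behavior can be sketched in plain Python. This is only an illustration of the idea, not Glue's implementation or API: merge_by_key, the email field, and the sample rows are hypothetical names made up for this example.

```python
from typing import Any, Dict, List, Sequence

Row = Dict[str, Any]


def merge_by_key(target: List[Row], staging: List[Row], keys: Sequence[str]) -> List[Row]:
    """Hypothetical helper: merge staging rows into target rows by key.

    Target rows whose key appears in staging are dropped, then every
    staging row is appended. No deduplication is performed.
    """
    def key_of(row: Row) -> tuple:
        # Build a tuple so composite (multi-column) keys work too.
        return tuple(row[k] for k in keys)

    staged_keys = {key_of(row) for row in staging}
    kept = [row for row in target if key_of(row) not in staged_keys]
    return kept + list(staging)


# Sample data (made up for illustration).
target = [
    {"customer_id": 41, "email": "ann@example.com"},
    {"customer_id": 42, "email": "old@example.com"},
]
staging = [
    {"customer_id": 42, "email": "new@example.com"},  # matches an existing row
    {"customer_id": 43, "email": "cy@example.com"},   # no match, so it is added
]

merged = merge_by_key(target, staging, ["customer_id"])
for row in merged:
    print(row)
```

Here the staging row for customer_id = 42 replaces the existing one, customer 41 is left alone, and customer 43 is added. In a real Glue job the equivalent step would be a call along the lines of target_frame.mergeDynamicFrame(staging_frame, ["customer_id"]), assuming both datasets are DynamicFrames.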
Important detail
Glue does not automatically deduplicate records that share the same merge key values, so if your source has duplicates, you may still need to clean them up before or after the merge.
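One way to handle that cleanup is to keep a single row per key before merging. The sketch below is plain Python and assumes each row carries a timestamp you can order by; dedupe_latest and the updated_at field are hypothetical names for illustration, not part of Glue.

```python
from typing import Any, Dict, List, Sequence

Row = Dict[str, Any]


def dedupe_latest(rows: List[Row], keys: Sequence[str], order_field: str) -> List[Row]:
    """Hypothetical helper: keep one row per key, preferring the highest order_field."""
    latest: Dict[tuple, Row] = {}
    for row in rows:
        k = tuple(row[name] for name in keys)
        # Replace the stored row when this one is at least as recent.
        if k not in latest or row[order_field] >= latest[k][order_field]:
            latest[k] = row
    return list(latest.values())


# Sample data (made up): customer 42 appears twice in the staging rows.
staging_rows = [
    {"customer_id": 42, "email": "first@example.com", "updated_at": "2024-01-01"},
    {"customer_id": 42, "email": "second@example.com", "updated_at": "2024-02-01"},
    {"customer_id": 43, "email": "cy@example.com", "updated_at": "2024-01-15"},
]

clean_rows = dedupe_latest(staging_rows, ["customer_id"], "updated_at")
for row in clean_rows:
    print(row)
```

Which duplicate to keep is a design choice. "Latest wins" is common, but if your data has no reliable ordering column you may need a different rule, such as failing the job when duplicates appear.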
TL;DR
A merge key in AWS Glue is basically the identifier column(s) Glue uses to decide whether two rows are the same record during a merge.