Field names, column names, element names, attributes and keys

You may find the following useful if you want to tighten up the language you use to refer to the names of data elements.

We got interested in this problem because we were reviewing some transform specifications and went down a rabbit hole of semantic pedantry.

Our approach was to look at each data context in turn and work out what we should call the name of a data element in that context.

In Kafka, the message itself has a key and a value. Kafka treats each of these as a byte array, so we can provide JSON, Avro, a simple string or any other serialised format. Typically we might use a simple string for the key and Avro for the value. A Kafka message may also have headers, which are an ordered list of additional key-value pairs; each header has a string key and a byte-array value. Headers are typically used for metadata rather than content data.
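To make that shape concrete, here is a minimal sketch using only plain Python types. KafkaMessage is a hypothetical class invented for this post, not part of any Kafka client library, and the value is serialised as JSON rather than Avro purely to keep the example dependency-free. The header layout (string key, bytes value) follows the Kafka Header interface.

```python
import json
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class KafkaMessage:
    """Hypothetical stand-in for a Kafka message; not a real client class."""
    key: bytes    # opaque bytes as far as Kafka is concerned
    value: bytes  # opaque bytes as far as Kafka is concerned
    # Ordered list of headers: string key, bytes value
    headers: List[Tuple[str, bytes]] = field(default_factory=list)


message = KafkaMessage(
    key="customer-42".encode("utf-8"),  # simple string key
    value=json.dumps({"customerId": 42, "status": "active"}).encode("utf-8"),
    headers=[("trace-id", b"abc123"), ("source", b"billing")],
)

# Kafka-level vocabulary: key, value, headers
header_keys = [k for k, _ in message.headers]

# Format-level vocabulary only applies once the value bytes are decoded
decoded_value = json.loads(message.value)
```

The point of the sketch is the split in vocabulary: key, value and headers are Kafka terms, while anything inside the decoded value is named using the terms of the serialisation format.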

In Kafka, talk about the Avro, JSON or other content of the message value using terms specific to the serialisation format used.

Avro has a schema which allows the bytes of the value to be interpreted as a record which is made up of fields. Each field has a name and a type, and holds a value. The type can be a primitive, or a complex type such as another record.

In Avro, talk about a field name.
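As an illustration, an Avro schema is itself written in JSON, so the field names can be read with nothing more than the standard json module. The Customer and Address records below are made up for this example, and field_names is a hypothetical helper that only handles nested records (not arrays, maps or unions).

```python
import json

# Hypothetical Avro schema for illustration; record and field names are made up.
schema_json = """
{
  "type": "record",
  "name": "Customer",
  "fields": [
    {"name": "id", "type": "long"},
    {"name": "email", "type": "string"},
    {"name": "address", "type": {
      "type": "record",
      "name": "Address",
      "fields": [
        {"name": "city", "type": "string"},
        {"name": "postcode", "type": "string"}
      ]
    }}
  ]
}
"""

schema = json.loads(schema_json)


def field_names(record_schema, prefix=""):
    """Return dotted field names for a record schema, descending into nested records."""
    names = []
    for f in record_schema["fields"]:
        path = prefix + f["name"]
        names.append(path)
        ftype = f["type"]
        # A field whose type is itself a record has fields of its own
        if isinstance(ftype, dict) and ftype.get("type") == "record":
            names.extend(field_names(ftype, prefix=path + "."))
    return names


avro_field_names = field_names(schema)
```

Note that the schema itself uses the word fields for the list and name for each entry, which is why field name is the natural term here.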

JSON content is made up of key-value pairs. (The JSON specification itself calls these name/value pairs, or the members of an object.)

JSON Schema has a properties keyword, which leads people to talk about JSON key/value pairs as properties.

In JSON, talk about a key.
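A small sketch of where the two terms come from, using the standard json module. The document and the schema are invented for illustration, and the schema is only inspected as a dictionary here, not used for validation.

```python
import json

# A JSON document: the names on the left of each colon are its keys
document = json.loads('{"customerId": 42, "email": "a@example.com"}')
document_keys = list(document.keys())

# A matching JSON Schema (hypothetical), where the same names sit under "properties"
json_schema = {
    "type": "object",
    "properties": {
        "customerId": {"type": "integer"},
        "email": {"type": "string"},
    },
}
schema_property_names = list(json_schema["properties"].keys())

# Same names, two vocabularies
same_names = document_keys == schema_property_names
```

So property is a reasonable word when discussing the schema, but when discussing the document itself, key keeps things unambiguous.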

In XML format, talk about an element name.
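For example, with Python's standard xml.etree.ElementTree module (which, slightly unhelpfully for our purposes, exposes the element name as tag). The XML snippet is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML content for illustration
xml_text = '<customer id="42"><email>a@example.com</email><city>Leeds</city></customer>'
root = ET.fromstring(xml_text)

# Element names: ElementTree calls these "tag"
root_element_name = root.tag
child_element_names = [child.tag for child in root]

# XML attributes are a separate concept: name/value pairs attached to an element
root_attributes = dict(root.attrib)
```

Here customer, email and city are element names, while id is an attribute of the customer element, so the two terms should not be used interchangeably in XML.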

In Parquet or ORC format, talk about a column name.

In CSV format, talk about a field name, but in TSV format, talk about a column name.
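A quick sketch with Python's standard csv module, which reads both formats with the same reader and only a different delimiter. The sample data is invented. Note that the module uses the term fieldnames for both formats, so the CSV/TSV distinction above is a convention about how we talk, not something the tooling enforces.

```python
import csv
import io

# Hypothetical sample data: the same content as CSV and as TSV
csv_text = "id,email\n42,a@example.com\n"
tsv_text = "id\temail\n42\ta@example.com\n"

csv_reader = csv.DictReader(io.StringIO(csv_text))
tsv_reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")

# The names taken from the header row of each file
csv_names = csv_reader.fieldnames
tsv_names = tsv_reader.fieldnames

# Each data row is then keyed by those names
csv_rows = list(csv_reader)
```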

In a database, talk about a column name.
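As a small illustration, here is how the column names of a query result can be read using Python's built-in sqlite3 module. The customer table and its columns are made up for the example.

```python
import sqlite3

# In-memory database with a hypothetical table, for illustration only
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, email TEXT)")
conn.execute("INSERT INTO customer (id, email) VALUES (?, ?)", (42, "a@example.com"))

cursor = conn.execute("SELECT id, email FROM customer")

# cursor.description holds one entry per result column; the first item is the column name
column_names = [d[0] for d in cursor.description]
rows = cursor.fetchall()
conn.close()
```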

When using an object to represent the data, if you mean the getter/setter, then in Java or Python talk about a property. If you mean the actual variable on an instance of an object, then in Java talk about a field and in Python talk about an attribute. Don’t talk about attributes in Java: it’s a vague term in a Java context.
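The Python half of that distinction can be sketched as follows. The Customer class is hypothetical; the setter's normalisation is only there to show that a property runs code, whereas an attribute simply holds a value.

```python
class Customer:
    """Hypothetical class showing a property backed by an instance attribute."""

    def __init__(self, email):
        self._email = email  # instance attribute: the actual variable on the object

    @property
    def email(self):  # property: the getter
        return self._email

    @email.setter
    def email(self, value):  # property: the setter
        self._email = value.strip().lower()


customer = Customer("a@example.com")
customer.email = "  B@Example.com "  # goes through the setter

# The instance only stores the attribute; the property lives on the class
instance_attributes = vars(customer)
is_property = isinstance(Customer.email, property)
```

From the caller's side, customer.email looks the same either way, which is exactly why it helps to say which one you mean in a specification.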

In Python, when using a dictionary to represent the data, or in Java, when using a Map, talk about a key.