IoT System Architectures: Begin with the Data in Mind

At the most basic, macro level, an IoT system architecture has two components:
• Things that are generating data
• Information technology that uses data to provide insights and management of those things
That view is workable if a single thing generates both the data and the insights, but it leans on minimal data architecture or information technology to manage those processes and controls.
As we dive deeper, we can delineate four sub-components that work together to generate, gather, process and provide insights on the data:
Device | Raw Data | Controls
- The thing
- Gateways, data management and edge nodes
Processing Data | Operations | Management
- Edge IT and data processing
- Data center / cloud
A complete and robust data architecture might have clearly defined silos around each of the four pieces identified above. Your needs may, however, call for a more flexible data architecture that allows data aggregation, processing and storage almost anywhere in the system, from the thing to the cloud. Generally, the more immediately information is needed at the thing, the closer the data processing and insights need to be to the thing.
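To make that rule of thumb concrete, here is a minimal Python sketch that maps a required response time to one of the four sub-components. The latency cutoffs and the names (Tier, ASSUMED_CUTOFFS_MS, suggest_tier) are hypothetical and chosen only for illustration; real cutoffs depend on your network, hardware and application.

```python
from enum import Enum


class Tier(Enum):
    THING = "the thing"
    GATEWAY = "gateway / edge node"
    EDGE_IT = "edge IT"
    CLOUD = "data center / cloud"


# Hypothetical cutoffs in milliseconds (assumptions, not measured values).
# Tighter deadlines map to tiers closer to the thing.
ASSUMED_CUTOFFS_MS = [
    (10, Tier.THING),
    (100, Tier.GATEWAY),
    (1000, Tier.EDGE_IT),
]


def suggest_tier(required_response_ms: float) -> Tier:
    """Suggest where processing could sit for a given response-time need."""
    for cutoff_ms, tier in ASSUMED_CUTOFFS_MS:
        if required_response_ms <= cutoff_ms:
            return tier
    # Anything slower than the last cutoff can tolerate a round trip to the cloud.
    return Tier.CLOUD


if __name__ == "__main__":
    for need_ms in (5, 50, 500, 60_000):
        print(need_ms, "ms ->", suggest_tier(need_ms).value)
```

The point is the shape of the decision rather than the numbers: as the acceptable delay shrinks, the processing moves toward the thing.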
We will now look closer at each of the four sub-components, and note some of our options for data handling in each area.
1. The “Thing”
It all starts with the thing.
The “thing” in the Internet of Things is an object – a car, tool, a toy, a building, or a really smart grain of rice. This object is typically coupled with sensors that generate data and with actuators that “do stuff” when instructed. These actuators might be responsible for opening a door, closing a valve, or flipping a switch.
The thing could have control over a process, and may or may not want to send data back to the cloud. Although not typical, data could be processed right at the sensor, which would put it at the edgiest of the edge of the network.
Sensors do not necessarily have to be physically attached to the thing. For example, sensors may need to monitor the temperature in your bedroom for proper operation of your super-cooled pillowcase.*
2. Gateways, Data Management and Edge Nodes
Data from your thing may go straight to the cloud, but a local gateway can enable preprocessing and filtering of data. The gateway can also transmit control commands from the cloud to your thing, which is then equipped to execute commands using actuators.
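As a rough illustration of that two-way role, the Python sketch below models a gateway that filters readings on their way up and relays commands on their way down. The Gateway class, the "close_valve" command name and the 0.5 minimum-change filter are assumptions made for this example, not part of any particular product.

```python
from typing import Callable, Dict, Optional


class Gateway:
    """Hypothetical local gateway sitting between a thing and the cloud."""

    def __init__(self, actuators: Dict[str, Callable[[], None]]) -> None:
        # Maps a command name to the actuator function that carries it out.
        self.actuators = actuators
        self.last_sent: Optional[float] = None

    def preprocess(self, reading: float, min_change: float = 0.5) -> Optional[float]:
        """Upstream: pass a reading on only if it moved enough since the last one sent."""
        if self.last_sent is None or abs(reading - self.last_sent) >= min_change:
            self.last_sent = reading
            return reading
        return None  # Filtered out; nothing goes to the cloud.

    def handle_command(self, command: str) -> bool:
        """Downstream: relay a cloud command to the matching actuator, if any."""
        action = self.actuators.get(command)
        if action is None:
            return False
        action()
        return True


if __name__ == "__main__":
    log = []
    gateway = Gateway({"close_valve": lambda: log.append("valve closed")})
    sent = [r for r in map(gateway.preprocess, [20.0, 20.1, 20.2, 21.0]) if r is not None]
    print(sent)                                   # readings forwarded to the cloud
    print(gateway.handle_command("close_valve"), log)
```

Here only two of the four readings would travel upstream, which is the kind of filtering that keeps small changes from flooding the cloud.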
IoT systems can potentially generate terabytes of data, so your particular IoT system architecture may include some or all of these possible edge node components as part of the gateway in close proximity to the thing.
- Analog-to-digital data conversion: Analog data can grow in volume quickly, so preprocessing may be required before sending to the data center or the cloud.
- Sensor data aggregation system: A separate physical device that may need to be portable and rugged to withstand environmental conditions.
- Streaming data processor: Ensures no data is lost or corrupted.
Your gateway may also be smarter than the average gateway – and could package up, filter and secure data transmission to the cloud, replacing some of the functions of edge IT.
Now that your data is clean and tidy, it’s almost ready to realize its lifelong dream of joining the cloud. But first, we’ll note how edge IT systems are often a critical step in preprocessing the data, lest you dump a mountain of unnecessary data into the cloud.
3. Edge IT Systems
Your data has been digitized and aggregated, but it may require additional analysis and processing before going to the data center or the cloud. This preprocessing step can select the most meaningful and insightful data points and only pass those on.
As an example, your thing has sensors that generate a constant stream of data on the torque, in ft-lb, exerted on a bearing. Your gateway and edge node clean the data to show results in 10-second increments. Your edge IT system would then preprocess that data and only send data to the cloud if it is above or below a designated threshold.
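The two steps in this example can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: readings arrive as (timestamp in seconds, torque) pairs, "cleaning" means averaging each 10-second window, and the low and high thresholds are made-up numbers. The function names are hypothetical.

```python
from statistics import mean
from typing import Dict, Iterable, List, Tuple

Reading = Tuple[float, float]  # (timestamp in seconds, torque reading)


def average_by_window(readings: Iterable[Reading], window_s: float = 10.0) -> List[Reading]:
    """Gateway / edge node step: average raw readings into fixed time windows."""
    buckets: Dict[int, List[float]] = {}
    for timestamp, torque in readings:
        buckets.setdefault(int(timestamp // window_s), []).append(torque)
    # Each result is (window start time, average torque in that window).
    return [(index * window_s, mean(values)) for index, values in sorted(buckets.items())]


def outside_threshold(windows: Iterable[Reading], low: float, high: float) -> List[Reading]:
    """Edge IT step: keep only windows whose average is below low or above high."""
    return [(start, avg) for start, avg in windows if avg < low or avg > high]


if __name__ == "__main__":
    raw = [(0, 50.0), (5, 52.0), (10, 80.0), (15, 84.0), (20, 49.0), (25, 51.0)]
    windows = average_by_window(raw)
    to_cloud = outside_threshold(windows, low=40.0, high=75.0)  # assumed thresholds
    print(windows)   # [(0.0, 51.0), (10.0, 82.0), (20.0, 50.0)]
    print(to_cloud)  # [(10.0, 82.0)]
```

Six raw readings become three windows, and only the one high-torque window is passed on to the cloud.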
This edge IT system could be in a remote location, but it generally sits in close physical proximity to the sensors, as there are many speed, bandwidth and security benefits to being on a local network.
4. Data Center / Cloud
When the data requires more in-depth processing and feedback doesn’t have to be immediate, it finds its way to a data center or the cloud. More powerful IT systems are available to analyze, manage and securely store the data.
The cleaned, filtered and preprocessed data can then be combined with data from other sources for deeper insights. Returning to our example of torque on a bearing, we can look at the dataset of high-torque readings in context over a period of time, and also look at additional variables like the age of the bearing or the speed of the conveyor belt to determine what action or control to assert on the system.
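As a hedged sketch of that last step, the Python below combines a list of high-torque readings with the bearing's age and the belt speed to pick an action. Every rule, limit and name here (BearingContext, recommend_action, the 3-reading, 365-day and 2.0 m/s cutoffs) is a hypothetical placeholder; a real system would derive its rules from historical data.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BearingContext:
    """Extra variables pulled from other data sources (assumed fields)."""
    age_days: int
    belt_speed_m_per_s: float


def recommend_action(high_torque_readings: List[float], context: BearingContext) -> str:
    """Pick an action from high-torque history plus context (illustrative rules only)."""
    repeated = len(high_torque_readings) >= 3  # assumed: 3+ readings means a pattern
    if repeated and context.age_days > 365:
        return "schedule bearing replacement"
    if repeated and context.belt_speed_m_per_s > 2.0:
        return "reduce belt speed"
    if high_torque_readings:
        return "keep monitoring"
    return "no action"


if __name__ == "__main__":
    readings = [82.0, 85.0, 90.0]
    print(recommend_action(readings, BearingContext(age_days=400, belt_speed_m_per_s=1.5)))
    print(recommend_action(readings, BearingContext(age_days=30, belt_speed_m_per_s=2.5)))
```

The same torque history leads to different actions depending on the context, which is why this combination step tends to live where the other data sources are available.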
Possible Data Architectures
Your unique data architecture is driven by the volume of data you are generating, and the speed with which you need information, insights, and action. Following are some examples of different data architecture options that vary in complexity.
Garage project: No real architecture. It’s a simple and fast project in a lab or garage. No need for big data crunching.
General starter framework: The data is generated at the thing, and you’re storing data at a data center or the cloud. But you decide where and how you will aggregate and process the data with a combination of gateways, edge nodes, and edge IT.
Built for speed: The data should be processed close to the thing, to quickly generate information and insights. Data storage may be minimal. Big data crunching is not happening.

Complex system: Multiple things operate as part of a system that requires big data processing and storage, with multiple systems to filter, process and combine the data. Controls and insights are delivered back to the system, possibly with machine learning incorporated to improve performance of actuators.
To summarize, the data architecture piece of your IoT system architecture should cover your unique needs for data aggregation, filtering, processing and analysis. Depending on the scope of your project, these functions could be small or large, and could exist on the edge, offsite, or in the cloud. The most critical consideration is how quickly you will need the data. This will help you determine how close to your thing these functions should sit.
* This is not a real thing. Yet.