
Data Ingestion

Data ingestion is the first stage in most data architecture designs. The process has two steps: first, it consumes data from assorted sources; second, it loads that data into centralized storage, where the organization can access and use it.

Warning

Data ingestion is a critical component of data engineering because downstream systems rely entirely on the ingestion layer's output.

Figure: The two steps of data ingestion
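The two steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the inline CSV and JSON strings stand in for real sources, an in-memory SQLite database stands in for centralized storage, and the names `consume`, `load`, and the `users` table are hypothetical.

```python
import csv
import io
import json
import sqlite3

# Hypothetical sources: a real pipeline would read files, APIs, or databases.
CSV_SOURCE = "id,name\n1,alice\n2,bob\n"
JSON_SOURCE = '[{"id": 3, "name": "carol"}]'


def consume():
    """Step 1: read records from each source into one common shape."""
    records = []
    for row in csv.DictReader(io.StringIO(CSV_SOURCE)):
        records.append({"id": int(row["id"]), "name": row["name"], "source": "csv"})
    for row in json.loads(JSON_SOURCE):
        records.append({"id": int(row["id"]), "name": row["name"], "source": "json"})
    return records


def load(records, conn):
    """Step 2: write the records into centralized storage."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS users "
        "(id INTEGER PRIMARY KEY, name TEXT, source TEXT)"
    )
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT INTO users (id, name, source) VALUES (:id, :name, :source)",
            records,
        )


# SQLite in memory stands in for the centralized store (an assumption).
conn = sqlite3.connect(":memory:")
load(consume(), conn)
rows = conn.execute("SELECT id, name, source FROM users ORDER BY id").fetchall()
```

Keeping `consume` and `load` as separate functions mirrors the two steps, so a new source only touches the first step and a new storage target only touches the second.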

The ingestion layer works with a variety of data sources, over which data engineers typically don't have full control.

Note

A good practice is to build a layer of data quality checks and a self-healing system that reacts to unexpected situations such as data loss, corruption, or system failure.
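As a rough sketch of that practice, the Python below pairs a simple quality check with a retry wrapper. Everything here is an assumption made for illustration: the validation rules, the quarantine approach, the backoff values, and the names `validate`, `split_by_quality`, `with_retries`, and `flaky_fetch`. Retrying with backoff is just one possible form of self-healing.

```python
import time


def validate(record):
    """Return the problems found in one record (an empty list means it passed)."""
    problems = []
    if record.get("id") is None:
        problems.append("missing id")
    amount = record.get("amount")
    if not isinstance(amount, (int, float)) or amount < 0:
        problems.append("invalid amount")
    return problems


def split_by_quality(records):
    """Separate clean records from ones to quarantine for later review."""
    clean, quarantined = [], []
    for record in records:
        problems = validate(record)
        if problems:
            quarantined.append((record, problems))
        else:
            clean.append(record)
    return clean, quarantined


def with_retries(operation, attempts=3, base_delay=0.01):
    """Retry a failing operation with exponential backoff before giving up."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except ConnectionError:
            if attempt == attempts:
                raise  # out of attempts: surface the failure
            time.sleep(base_delay * 2 ** (attempt - 1))


# Hypothetical source that fails twice before succeeding.
calls = {"count": 0}


def flaky_fetch():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("source temporarily unavailable")
    return [
        {"id": 1, "amount": 10.0},
        {"id": None, "amount": 5.0},  # fails the id check
        {"id": 2, "amount": -1},      # fails the amount check
    ]


records = with_retries(flaky_fetch)
clean, quarantined = split_by_quality(records)
```

Quarantining bad records instead of dropping them keeps the evidence needed to diagnose data loss or corruption at the source.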

Generally, there are several types of data ingestion, as below:

Various methods to perform data ingestion:

Reference: Educative - Data Ingestion