Data Architecture: What is it and Why should we care?

Basic tenets of Data architecture

Data architecture has many different definitions across the industry. Here is what Wikipedia says: “In information technology, data architecture is composed of models, policies, rules or standards that govern what data is collected, and how it is stored, arranged, integrated, and put to use in data systems and within the organization. Data is usually one of several architecture domains that form the pillars of an enterprise architecture or solution architecture”. I would define data architecture as a discipline that deals with designing, constructing, and integrating an organization’s data assets so that they are well optimized for running the business.

Data Architecture Vs House Architecture

Enterprise architecture is often compared with the architecture of a house, which defines the individual design elements of every nook and corner so that the house can be built to specification. Similarly, data architecture is a design that defines the way data enters the organization, lives in its systems and applications, moves within the organization, and is consumed to run the business. It is a blueprint of the data design.

Most enterprises deal with unmanageable data sprawl that continues to grow at tremendous speed. This “just a bunch of data”, or JBOD, is a major driving force behind the need for a data strategy and an enterprise data architecture. The architecture is much like a road network for reaching IT goals, and thus business goals, while the data strategy defines “how” to reach those goals. In principle, the data architecture defines a framework that helps organize data ingestion, data storage and management, and data processing. Its components include data integration, DBMS selection, data modeling, performance and measurement, security and privacy, business intelligence, metadata, data quality, and data governance.

The house-design analogy (see the table below) is a useful way to understand which components need attention when we talk about data architecture.

Data Architecture | House Architecture

Policies, Rules, and Standards | Code of house building
Policies, rules, and standards are the first things required to build the house. One can use existing industry-standard frameworks such as TOGAF or Zachman. These are the guard rails for the full life cycle of data within the organization.

Data Subject Areas (Inventories) | Naming the spaces, e.g. living room, kitchen
Naming each space helps us know what types and categories of data are being used in the organization. It is the inventory of the data space an organization has.

Data Models | House plan
The data model is the actual diagram of the various data entities at the conceptual, logical, and physical levels. These levels of detail help teams collaborate on, implement, and test data specifications on the system.

Metadata | Room specifications
This is the lowest and last level of detail about the data. It describes the properties of the data being stored, and it is where data context is added.

Integration | Utilities hookup
This is where data movement between systems is handled. An integration plan covers what data is transferred, how it is transferred, and how it is managed in flight.

Data | Residents of the house
Data is the resident of the house: it lives, moves, and is archived, deleted, and updated. A great data architecture plans for organic data growth over the foreseeable future.

So why do we need a road network to reach our goals? Well… because we all value order over chaos. We would rather follow etiquette than have no rules of conduct. Okay… but what will this “order” give us, and how? The question is simple to answer but rather difficult to execute. An enterprise data architecture helps us onboard data quickly and deliver clean, trustworthy data at the speed required by patrons and the business. It also ensures that data is handled in a secure and compliant way, as local laws and regulations may require. It makes it easier to incorporate new data types and technologies, and it enables end-user self-service. The list goes on, but the cardinal point is that data architecture ties it all together.

However, it is unwieldy to build from scratch an enterprise data architecture that meets all our needs. A more pragmatic approach is to build toward the future state of the architecture with each new strategic business initiative. More in the next issue…
