The Minimal Information a Transformation Needs

A field account of the basic information every transformation requires before it can begin: why it is rarely treated as a project cost and risk, how that omission distorts the business case, and how estimating its price may help predict success or failure and prevent avoidable waste of organizational resources.

Document Status — Field Notes / Working Paper · Series: The Cost of Clarity (flagship)
AI Integrity Management working group, The Integral Management Society · Iván Abril Palma

Previous papers in this series:
• Paper 1: When the Problem Isn’t the Technology
• Paper 2: When Asking the Question Changes the Answer
• Paper 3: When You Have to Decide Before You Can Discover
• Paper 4: When Cleaning Up Means Betting Blind
• Paper 5: How to Measure the Cost of Not Knowing
• Paper 6: Is Clarity Getting More Expensive?

See also: The Human Intelligence Gap Series

Abstract

Every transformation of a technology estate — rationalization, automation and AI enablement, process mining, guardrails, or data modernization — depends on a minimum set of information about what is being changed. The exact information required varies by transformation, but it normally includes a subset of the following: the boundary of the object or process, its accountable owner, its authoritative data sources, and the costs, dependencies, usage, value, or risks relevant to the intervention.

In practice, this information is often missing and rarely treated as a cost and risk of the project itself. It is assumed to be already available or to be supplied by the organization, so producing it does not enter the business case. When it cannot be provided, the project is described as blocked by organizational unreadiness rather than by an unpriced requirement of the method.

This paper argues that estimating the cost and risk of producing this minimum information can improve the business case and help organizations avoid transformations whose full cost may exceed their expected value. It also proposes that this measure may help predict transformation success or failure, a proposition that can be tested against past projects. A companion paper examines how this unpriced cost accumulates and is ultimately paid in human intelligence.

Motivation

I first encountered this problem at Nokia’s R&D centre in Barcelona, where technically advanced projects were closed despite capable people and valuable work. I am an Enterprise Architect with a primary focus on IT, supported by a background in economics and psychology. Over the following twenty-five years, advising CIOs and global firms on rationalization and transformation, I repeatedly observed the same pattern: decisions that appeared mistaken were often difficult to take because the required information, authority, and accountability were structurally separated.

This paper examines that prior problem: why the architectural decisions needed to enable transformation are often so difficult to make and sustain that they remain unresolved.

The same point of failure

Several kinds of transformation dominate the modern technology estate, and they look unrelated. They are not. Each is, at bottom, a way of producing tracked, usable information about how the organization works, and each fails, repeatedly, at the same point: it lacks more basic information that it assumed it already had.

Rationalization removes duplicated systems to simplify the estate.
Automation, and now AI enablement, moves work from people to machines.
Process mining reconstructs how work flows from the digital traces it leaves.
Guardrails — operational and machine-learning limits — keep a system inside known, safe bounds.
Data modernization rebuilds how data is stored, integrated and served.

The studies are consistent, and so is the cause they name. Rationalization seldom realizes its intended benefits (Gartner), and roughly half of completed modernization programmes did not reduce the debt they targeted (McKinsey). The live systems carry dependencies, owners, and costs that no one can produce, so only the already-dead systems are retired.

AI fails the same way. Across thirty years and every wave (data warehousing, CRM, big data, data science, and now generative AI), the most commonly reported cause is the same: the data is not ready. RAND reports estimates that more than 80% of AI projects fail, about twice the rate of IT projects that do not involve AI, and names inadequate data among the root causes (Ryseff et al., 2024).

Process mining cannot build its own input without prior knowledge: extracting an event log requires choosing a case notion and locating the right data across scattered tables, which is exactly the domain knowledge the exercise was meant to recover (van der Aalst et al., 2012; Suriadi et al., 2017).

Guardrails fail because their bounds cannot be set without knowing a system’s normal range; set blind, they flood operators with false alarms until the alarms are switched off.

Data modernization, the one transformation supposed to cure all the others, fails for the same reason: it too needs prior information about the data flow before it can begin.

A common response is: modernize the data first, and the rest will follow. But data modernization depends on some of the same prior conditions. Empirical studies of data quality and data governance repeatedly identify unclear roles, fragmented accountability, weak mandates, and unresolved decision rights as major barriers to effective data management (Cheong & Chang, 2007; Weber, Otto & Österle, 2009; Haug et al., 2013).

Technical teams can profile, clean, integrate, and trace data, but they cannot independently decide which conflicting definition should prevail, which source should be authoritative, or which business party must accept ownership. Studies of distributed and domain-oriented data management show that transferring data ownership and establishing federated governance remain central implementation difficulties even in modern data architectures (Vestues et al., 2022; Bode et al., 2023; Goedegebuure et al., 2024).

The required knowledge may be distributed across several specialists, but an identifiable accountable authority must ultimately establish the definitions, ownership, and source status that the modernized environment will use. Data modernization can reveal inconsistencies and provide the technical means to resolve them; it cannot by itself create the organizational authority needed to decide among them.

Enterprise-architecture repositories face the same limitation. They can structure information about applications, owners, dependencies, and costs, but their value depends on the quality, use, and organizational anchoring of that information (Urbach, Lux & Riempp, 2010; Lange, Mendling & Recker, 2012; Ehrensperger, Sauerwein & Breu, 2020). If application boundaries or ownership have not been authoritatively decided, the repository can record only provisional or conflicting answers; governance, management support, and knowledge exchange are still required to make its contents actionable (Foorthuis et al., 2016).

Across rationalization, automation, process mining, guardrails, data modernization, and EA management, the recurring shortfall is therefore the same: a basic information unit — the minimum agreed and accountable information about boundaries, ownership, authoritative sources, costs, dependencies, and risks. The exact subset varies by transformation, but no intervention can proceed reliably when the information it requires is missing or merely assumed.

Two clarifications follow, and they are the heart of the contribution. First, the basic information unit is not one more transformation alongside the others; it is the minimal governance and information architecture they each presuppose — the accountable decisions about boundaries, ownership, and canonical sources that rationalization, modernization, mining and the rest consume as an input and cannot themselves produce. The literature above makes the same point from its own side: these are problems of roles, mandate, and decision rights, not of technique. Second, it is minimal by design — not a comprehensive enterprise-architecture programme, which is itself one of the interventions that stalls here, but only the smallest set of declarations the chosen transformation actually requires to begin.

Why the information is not there

It is tempting to assume the missing information is simply undocumented. Often, it was never decided.

Data still flows and operations continue, but the architecture of that flow may be unclear: which source is canonical, who owns the data, and who is accountable for the application or process as a whole. These are not always facts that can be discovered. They may require an authoritative decision.

What was never decided cannot be supplied as an input to transformation. Rationalization lacks reliable ownership and cost, process mining and automation lack accountable process definitions, and guardrails lack accepted operating boundaries.

Informative vs. Declarative Information: the impossibility of gathering minimal information without changing it and the process itself

Two kinds of missing information must be distinguished. Informative information already exists and can be retrieved. Declarative information does not exist until an authorized party establishes it. “Department A owns this application” is not simply a fact to discover; it is a decision that creates ownership and accountability.

Some minimal information therefore cannot be gathered without changing the process itself. Declaring the canonical source, application boundary, owner, budget owner, or accountable party changes responsibilities, budgets, and behaviour.

Producing this information carries three distinct costs:

The cost of looking: identifying the existing information, its sources and contradictions, and determining what can be retrieved and what remains undecided.
The cost of deciding: obtaining the authority, agreement, and judgement required to establish the missing ownership, boundaries, canonical sources, and accountability.
The cost of absorbing: adapting systems, budgets, teams, integrations, and operations to the consequences of those declarations.

A remark, by example: the need for declarative information becomes clear when fifteen people identify themselves as responsible for the same application, but none is accountable for it or owns its budget. They use different names and boundaries for the application, identify different canonical sources for the same information, state contradictory critical dependencies, and maintain separate backlogs and separate deployment and integration teams. These are informative statements about the current situation, but they do not provide a stable basis for meaningful transformation (or even for a deep process-mining or rationalization study). An authorized decision is still required to establish which answers are authoritative.

Once those declarations are made, the baseline itself changes. The process and the application are no longer organized in the same way: tasks, responsibilities, budgets, interfaces, and expectations are reassigned. Declarative information is therefore not only an input to transformation; producing it is already an intervention in the process and the application.

Entanglement, and why declarative decisions need to climb

Not all information is equally entangled, but the most critical declarations often affect many systems and teams. Take a single field such as “client name.” If seven applications store it differently, partly merge the records, and no one has authority to define what counts as the client record, selecting a canonical source is not a technical update. It is a decision that must reconcile seven systems and all the parties that depend on them.

The more systems, budgets, and responsibilities a declaration affects, the higher it must climb to become binding. This increases the cost of deciding because fewer people have the required authority, and they need more coordination and prior information.

But as the decision moves upward, it also moves away from the operational knowledge needed to make it well. Those closest to the process understand the detail but may lack authority; those with authority may lack the necessary context. The decision therefore rises to where it can be authorized and away from where it can be best understood. This is the governance bottleneck created by entanglement.
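
The climb described above can be sketched as a search for the lowest level of authority whose mandate covers every unit a declaration affects. The sketch below is illustrative only: it assumes authority follows a strict reporting tree, which real organizations rarely do, and the unit names, the REPORTS_TO table, and both function names are hypothetical.

```python
from typing import Dict, List, Optional

# Hypothetical reporting lines: each unit maps to the unit it reports to.
REPORTS_TO: Dict[str, Optional[str]] = {
    "cio": None,
    "head_sales_it": "cio",
    "head_finance_it": "cio",
    "crm_team": "head_sales_it",
    "web_shop_team": "head_sales_it",
    "billing_team": "head_finance_it",
}


def chain_to_top(unit: str) -> List[str]:
    """Return the unit followed by every level above it, bottom-up."""
    chain: List[str] = []
    current: Optional[str] = unit
    while current is not None:
        chain.append(current)
        current = REPORTS_TO[current]
    return chain


def lowest_binding_authority(affected: List[str]) -> str:
    """Lowest level whose mandate covers every affected unit."""
    if not affected:
        raise ValueError("at least one affected unit is required")
    first = chain_to_top(affected[0])
    others = [set(chain_to_top(unit)) for unit in affected[1:]]
    for candidate in first:
        if all(candidate in chain for chain in others):
            return candidate
    raise ValueError("no common authority in the hierarchy")


def levels_climbed(affected: List[str]) -> int:
    """Largest distance between an affected unit and the binding authority."""
    authority = lowest_binding_authority(affected)
    return max(chain_to_top(unit).index(authority) for unit in affected)
```

Under these assumptions, a declaration that touches only crm_team and web_shop_team can be made one level up, while adding billing_team pushes it to the top of the tree, two levels away from each team that understands the detail.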

The risk of declarative decisions

The cost is the effort and seniority required to make a declaration. The risk is that the declaration is wrong because hidden dependencies were not visible, and something valuable is disrupted.

Cloud-cost ownership illustrates this. Finance, FinOps, licence owners, and technical teams may all contribute to the same cost picture. A clean upstream assignment of ownership can look correct and still break the dependencies that made the process work in practice. The dependency is circular: FinOps needs reliable financial and usage data to optimize costs, but producing that data depends on ownership and optimization decisions that FinOps cannot make without it.

The unpriced precondition

The issue is not that the cost and risk of producing the required information are impossible to measure. The issue is that transformation methods normally treat this information as a precondition to be supplied by the organization, rather than as part of the intervention itself.

A rationalization business case, for example, may forecast a 30% cost reduction or a fourfold return on the consulting and implementation expenditure, while assuming that application boundaries, ownership, costs, dependencies, usage, and business value are already available. The effort and risk required to produce those inputs are excluded. If they cannot be supplied, the project later records the same information gap as organizational unreadiness or as a cause of failure.

The expected return is therefore calculated against the cost of executing the transformation, but not necessarily against the full cost of making the transformation possible.
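
The gap between the two calculations can be made concrete with a small worked example. It is a sketch under stated assumptions: "fourfold return" is read here as expected benefit divided by expenditure, every amount is a hypothetical placeholder rather than a figure from any project, and the BusinessCase class and its field names are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BusinessCase:
    """All amounts share one currency unit; the values used below are hypothetical."""
    expected_benefit: float
    execution_cost: float        # consulting and implementation expenditure
    looking_cost: float = 0.0    # finding what exists and what is undecided
    deciding_cost: float = 0.0   # authority and agreement for the declarations
    absorbing_cost: float = 0.0  # adapting teams, budgets and systems afterwards

    @property
    def precondition_cost(self) -> float:
        """Cost of producing the information the method assumed was available."""
        return self.looking_cost + self.deciding_cost + self.absorbing_cost

    @property
    def stated_return(self) -> float:
        """Return as the business case usually reports it."""
        return self.expected_benefit / self.execution_cost

    @property
    def full_return(self) -> float:
        """Return once the precondition is priced in."""
        return self.expected_benefit / (self.execution_cost + self.precondition_cost)


# Hypothetical figures, chosen only to make the arithmetic visible.
case = BusinessCase(
    expected_benefit=4000,
    execution_cost=1000,
    looking_cost=200,
    deciding_cost=300,
    absorbing_cost=500,
)
```

With these placeholder amounts the stated return is 4.0 and the full return is 2.0. The forecast benefit is unchanged; only the denominator has been completed.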

Measuring the cost and risk of the initial information

Once this minimum information is recognized as something that must be produced rather than assumed, its cost and risk can be estimated before the organization commits to the full transformation.

The initial gathering already provides the necessary evidence:

Discovery cost: the time, expenditure, and coordination required to identify the relevant systems, data, owners, and dependencies.
Declarative burden: the proportion of required information that cannot be retrieved and instead requires an authoritative decision.
Decision cost: the effort and organizational level required to establish ownership, boundaries, canonical sources, and accountability.
Absorption cost: the changes to teams, budgets, systems, integrations, and responsibilities created by those declarations.
Value at risk: the business value found in newly discovered or poorly understood parts of the estate, estimated through representative sampling.
Unseen scope: the probable size of what remains undiscovered, estimated from the rate at which successive gathering exercises continue to find new elements.

The organization’s reluctance to retire systems it does not understand also provides evidence of perceived risk. It is not a complete valuation, but it shows that the unknown is not treated as having zero cost or zero potential value.

The full discovery does not need to be completed before these readings begin. Cost and risk become visible through the initial attempt to produce the information.
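
Two of the readings above lend themselves to simple arithmetic. The sketch below is one possible way to compute them, not a prescribed method: the declarative burden is taken as a plain share of required items, and the unseen scope is extrapolated by assuming that new finds shrink by a roughly constant ratio from one gathering round to the next (a geometric-decay assumption introduced here, not taken from the text). The item labels, counts, and function names are hypothetical.

```python
from typing import Dict, List

# Hypothetical classification of the items one transformation requires.
REQUIRED_ITEMS: Dict[str, str] = {
    "application boundary": "declarative",
    "accountable owner": "declarative",
    "dependencies": "informative",
    "usage": "informative",
}


def declarative_burden(items: Dict[str, str]) -> float:
    """Share of required items that need a decision rather than retrieval."""
    if not items:
        raise ValueError("no required items listed")
    declarative = sum(1 for kind in items.values() if kind == "declarative")
    return declarative / len(items)


def unseen_scope(new_per_round: List[int]) -> float:
    """Extrapolate how many elements remain undiscovered.

    Assumes each round finds a constant fraction r of what the previous
    round found, so the remainder is the geometric tail n_last * r / (1 - r).
    """
    if len(new_per_round) < 2 or min(new_per_round) <= 0:
        raise ValueError("need at least two rounds with positive counts")
    ratios = [
        later / earlier
        for earlier, later in zip(new_per_round, new_per_round[1:])
    ]
    r = sum(ratios) / len(ratios)
    if r >= 1:
        # Discovery is not slowing down, so the remainder cannot be bounded.
        return float("inf")
    return new_per_round[-1] * r / (1 - r)
```

For example, three rounds that surface 40, 20, and 10 new elements give r = 0.5 and an estimated 10 elements still unseen, while rounds that keep finding the same number each time return infinity, which is itself a useful reading: the scope is not yet converging.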

A candidate predictor

This creates a testable proposition: transformations facing a higher cost of clarity, a larger declarative burden, and greater value exposure should be more likely to stall, fail, or deliver a lower return.

The proposition can be tested retrospectively. Initial inventories, interviews, discovery records, governance escalations, and project post-mortems can be compared with the eventual outcomes of past transformations. Existing studies already identify inadequate data, unclear dependencies, weak governance, and organizational unreadiness among recurring causes of failure. The additional test is whether the measured cost and risk of producing the required initial information correlate with those outcomes.

If that relationship is confirmed, the measure could help organizations estimate not only whether a transformation is technically possible, but whether it is economically justified before committing the larger investment.
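
A first retrospective check could be as simple as a correlation between a measured reading and a recorded outcome. The sketch below computes a Pearson coefficient between each project's declarative burden and a stalled/not-stalled flag (with a binary outcome this is the point-biserial correlation). The choice of statistic is an assumption made for illustration, and the records are placeholders that show the data shape only; they are not observations from any project.

```python
from math import sqrt
from typing import List, Sequence, Tuple


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a sequence with no variation has no correlation")
    return cov / sqrt(var_x * var_y)


# Placeholder records (declarative burden, stalled flag); NOT real observations.
past_projects: List[Tuple[float, int]] = [
    (0.2, 0),
    (0.3, 0),
    (0.6, 1),
    (0.8, 1),
]

burdens = [burden for burden, _ in past_projects]
stalled = [flag for _, flag in past_projects]
association = pearson(burdens, stalled)
```

A positive coefficient on real records would be consistent with the proposition but would not establish it: a retrospective sample is small and self-selected, and the same test would need to be repeated for discovery cost, decision cost, and value at risk.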

An informed decision

Prediction is not the only value of the measurement. Its more immediate use is to improve the decision before the transformation begins.

By estimating the cost and risk of producing the required information, the organization can decide whether to proceed, wait, reduce scope, or choose a different intervention. Today, these costs are often absent from the business case and appear only later, when the project stalls.

Including them beside the expected benefits turns a partial business case into a more informed decision.

References

Bode, J., Kühl, N., Kreuzberger, D., Hirschl, S., & Holtmann, C. (2023). Towards avoiding the data mess: Industry insights from data mesh implementations. arXiv:2302.01713 (subsequently in IEEE Access, 2024).

Cheong, L. K., & Chang, V. (2007). The need for data governance: A case study. In Proceedings of the 18th Australasian Conference on Information Systems (ACIS 2007).

Ehrensperger, R., Sauerwein, C., & Breu, R. (2020). Current practices in the usage of inter-enterprise architecture models for the management of business ecosystems. In 2020 IEEE 24th International Enterprise Distributed Object Computing Conference (EDOC) (pp. 21–29).

Foorthuis, R., van Steenbergen, M., Brinkkemper, S., & Bruls, W. A. G. (2016). A theory building study of enterprise architecture practices and benefits. Information Systems Frontiers, 18(3), 541–564.

Goedegebuure, A., Kumara, I., Driessen, S., van den Heuvel, W.-J., Monsieur, G., Tamburri, D. A., & Di Nucci, D. (2024). Data mesh: A systematic gray literature review. ACM Computing Surveys, 57(1), Article 11 (preprint 2023, arXiv:2304.01062).

Haug, A., Arlbjørn, J. S., Zachariassen, F., & Schlichter, J. (2013). Master data quality barriers: An empirical investigation. Industrial Management & Data Systems, 113(2), 234–249.

Lange, M., Mendling, J., & Recker, J. (2012). Realizing benefits from enterprise architecture: A measurement model. In Proceedings of the 20th European Conference on Information Systems (ECIS 2012).

Ryseff, J., De Bruhl, B. F., & Newberry, S. J. (2024). The root causes of failure for artificial intelligence projects and how they can succeed. RAND Corporation.

Suriadi, S., Andrews, R., ter Hofstede, A. H. M., & Wynn, M. T. (2017). Event log imperfection patterns for process mining: Towards a systematic approach to cleaning event logs. Information Systems, 64, 132–150.

Urbach, N., Lux, J., & Riempp, G. (2010). Understanding the performance impact of enterprise architecture management. In AMCIS 2010 Proceedings.

van der Aalst, W. M. P., et al. (2012). Process mining manifesto. In Business Process Management Workshops (BPM 2011), LNBIP 99 (pp. 169–194). Springer.

Vestues, K., Hanssen, G. K., Mikalsen, M., Buan, T. A., & Conboy, K. (2022). Agile data management in NAV: A case study. In Agile Processes in Software Engineering and Extreme Programming (XP 2022), LNBIP 445 (pp. 220–235). Springer.

Weber, K., Otto, B., & Österle, H. (2009). One size does not fit all—A contingency approach to data governance. ACM Journal of Data and Information Quality, 1(1), Article 4, 1–27.

The Gartner and McKinsey figures cited in the text are drawn from industry and analyst reports.