Optimizing data management with disparities in data value
Academic Article

abstract

  • ABSTRACT

When there is a disparity in the value of different data records and fields, there is a need to optimize data resources. Not all data necessarily contribute the same value; the contribution depends on how the data are used, as well as on a variety of other factors. This paper presents models for optimizing data management in the presence of a disparity between the values contributed by different data. We expound on what disparity of data value represents and illustrate models to derive a numerical measure of such disparity. We then use real-world data from a large data resource used to manage alumni relations to demonstrate our optimization methods and results. Finally, we discuss the tradeoffs between value and cost, and the implications for data management, both in this real-world context and in general.

KEYWORDS: Disparity, Data Value, Data Management, Optimization

INTRODUCTION

Organizations make large investments in technology to manage data, a critical organizational resource. In making these investments, organizations have to carefully evaluate the information systems and technology that are used to manage the data. Typically, such evaluations are based primarily on technical requirements such as storage capacities and processing speeds, on functional requirements such as presentation formats (e.g., dashboards and visualization), and on business needs such as speed of delivery and search capabilities. In this paper, we suggest that the evaluation of data management systems must consider yet another perspective: economic aspects. Without minimizing the importance of technical and functional aspects, we suggest that the design and management of data resources ought to also consider the cost-benefit tradeoffs associated with managing data resources.

To emphasize this perspective, we argue that not all data should be treated as contributing equally to the benefit derived from using a data resource, and that some records in a dataset may contribute more to that benefit than others. We refer to this as disparity in the value derived from the data (or value disparity). We believe that understanding value disparity and the associated value/cost tradeoffs has important implications for data management. Not only can it affect how we use data, it can also affect the design and management of data resources and associated information systems.

We examine value disparity in a large data resource used to manage alumni relations. Using this as a context, we first evaluate value disparity and show that it is significantly large in this data resource. We then describe the current data acquisition and management policies, discuss the relationship between the policies and the value disparity identified, and, in this context, highlight related implications for data management, including how the data resource may be optimized from an economic perspective.

The rest of the paper is organized as follows. First, we describe the research relevant to value/cost tradeoffs in data management to define the scope of our research. We then develop our models for assessing the magnitude of value disparity. We further illustrate the application of our models using sizable samples from the data resource for managing alumni relations. Finally, we discuss the implications of understanding value disparity for data management and conclude with a discussion of the limitations of our research.

BACKGROUND

Managing data efficiently and effectively helps organizations realize the business value of the data. Data contribute to business value in multiple ways. Data support managing operational activities such as tracking supply chain activities (Gattiker & Goodhue, 2004) and customer relationships (Roberts & Berger, 1999).
Data also help organizations gain competitive advantage through analytics (Davenport, 2006) and decision support (March & Hevner, 2007; Ramakrishnan et al. …
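
The abstract refers to models that derive a numerical measure of value disparity, but this excerpt does not state which measure is used. As a minimal sketch only, assuming the Gini coefficient as a stand-in inequality measure (an assumption, not necessarily the authors' model), the disparity across per-record values could be quantified as follows. The function names `gini` and `top_share` and the sample values are hypothetical.

```python
def gini(values):
    """Gini coefficient of non-negative per-record values.

    0.0 means every record contributes equally; values approaching 1.0
    mean that a few records contribute most of the total value.
    NOTE: using Gini here is an illustrative assumption, not the paper's model.
    """
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted formula over values sorted in ascending order.
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return (2.0 * weighted) / (n * total) - (n + 1.0) / n


def top_share(values, fraction):
    """Share of total value contributed by the top `fraction` of records."""
    xs = sorted(values, reverse=True)
    total = sum(xs)
    if not xs or total == 0:
        return 0.0
    k = max(1, round(len(xs) * fraction))
    return sum(xs[:k]) / total


# Hypothetical per-record values (e.g., value attributed to each record).
record_values = [0, 0, 5, 10, 20, 50, 400, 1500]
print(f"Gini: {gini(record_values):.3f}")
print(f"Top 25% share: {top_share(record_values, 0.25):.3f}")
```

A high coefficient, or a large share held by a small fraction of records, would indicate the kind of disparity that motivates weighing value against cost when deciding which records to acquire and maintain.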

publication date

  • January 1, 2015