Data-driven public administration - analysis of CVR and accounting data

Danish government organisations are at the forefront internationally when it comes to digitalisation. However, much potential is still to be unlocked. For example, Danish SMEs are today encumbered with rules, regulations and deadlines imposed by public administration, and the Danish Business Authority believes that public administration should transition from being a burden to being a support function for SMEs.

With high-quality company registration, data can be used to predict which companies are at risk of bankruptcy, are attempting fraud, or have particular potential for growth. Such predictions can save society from losses and, at the same time, allow targeted support to be provided to companies that are at risk or in growth zones.

The research activities should provide new data analytics methods and tool prototypes that the domain experts at the Danish Business Authority can use to increase the quality and efficiency of their audits and supervision of companies. Moreover, the research should provide methods for securing high-quality data registration through real-time consistency checks of the data at entry time.
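To make the idea of consistency checks at entry time concrete, the following is a minimal sketch rather than a description of any actual system. The `Filing` record, its field names and the rounding tolerance are all assumptions made for illustration. The three rules are generic ones: a CVR number consists of eight digits, a founding date cannot lie in the future, and assets must equal liabilities plus equity.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Filing:
    # Hypothetical record; the field names are not taken from any real schema.
    cvr: str
    founded: date
    assets: float
    liabilities: float
    equity: float


def check_filing(f: Filing, today: date, tolerance: float = 0.5) -> list[str]:
    """Return a list of problems found; an empty list means the entry passed."""
    problems = []
    # Format rule: eight ASCII digits (this does not verify any check digit).
    if not (f.cvr.isascii() and f.cvr.isdigit() and len(f.cvr) == 8):
        problems.append("CVR number must be exactly 8 digits")
    # Plausibility rule: a company cannot be founded in the future.
    if f.founded > today:
        problems.append("founding date lies in the future")
    # Accounting identity, with an assumed tolerance for rounding.
    if abs(f.assets - (f.liabilities + f.equity)) > tolerance:
        problems.append("assets do not equal liabilities plus equity")
    return problems
```

Returning all problems at once, instead of stopping at the first, would let a form show every inconsistency to the user while the data is still being entered.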

We analyse the need for further collection, refinement, distribution and use of public data in selected areas. For example, we will consider data from the Danish Business Authority on business entity, company branch, location and finances, and on owner, management and board organisation. We analyse such data for relationships, patterns and outliers in order to support evidence-based prediction of bankruptcy, fraud and growth potential. Predictive use of such data is likely, for instance, to support classification of companies in terms of growth, profitability and risk of bankruptcy. As a very specific study, we consider the use of data from digital self-service platforms, where companies submit more than 11 million online forms each year.
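As one illustration of what searching for outliers can mean in practice, the sketch below flags values with a large modified z-score based on the median absolute deviation (MAD). This is a standard robust statistic chosen here for illustration, not necessarily the method used in the project. The function name and the conventional threshold of 3.5 are assumptions.

```python
from statistics import median


def mad_outliers(values: list[float], threshold: float = 3.5) -> list[int]:
    """Return the indices of values whose modified z-score exceeds the threshold."""
    med = median(values)
    # Median absolute deviation: a spread measure that extreme values barely affect.
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return []  # no spread around the median, so the score is undefined
    # 0.6745 rescales the MAD so the score is comparable to a standard z-score.
    return [
        i for i, v in enumerate(values)
        if 0.6745 * abs(v - med) / mad > threshold
    ]
```

Applied to, say, one reported accounting figure across comparable companies, the returned indices point to the entries a domain expert may want to inspect first.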

The methods applied are data cleaning, network analysis, machine learning and visual analytics. In particular, the tools for the domain experts at the Danish Business Authority will be based on a combination of machine learning and visual analytics. We provide the domain experts with interactive visual analytics tools that enable them to iteratively refine the feature selection and the labelling of data for machine-learning-based clustering.
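A small example of the network analysis mentioned above, given as a sketch under stated assumptions: companies are linked whenever they share a board member, and connected groups of companies are then extracted. The input format (pairs of person and company identifiers) and the function name are hypothetical, and the approach is a plain connected-components search rather than the project's own method.

```python
from collections import defaultdict


def company_clusters(board_seats: list[tuple[str, str]]) -> list[set[str]]:
    """Group companies that are connected through shared board members."""
    # Collect the companies on whose boards each person sits.
    by_person = defaultdict(set)
    for person, company in board_seats:
        by_person[person].add(company)

    # Two companies are neighbours if at least one person sits on both boards.
    neighbours = defaultdict(set)
    for companies in by_person.values():
        for c in companies:
            neighbours[c] |= companies - {c}

    # Depth-first search to collect the connected components.
    seen, clusters = set(), []
    for start in list(neighbours):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            c = stack.pop()
            if c in component:
                continue
            component.add(c)
            stack.extend(neighbours[c] - component)
        seen |= component
        clusters.append(component)
    return clusters
```

For example, if person A sits on the boards of X and Y, and person B on the boards of Y and Z, then X, Y and Z end up in one cluster even though X and Z share no board member directly.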

We will develop concrete tools to make better use of transaction data from VIRK in order to minimise the burdens on SMEs: better implementation of tracking tools and better analysis of data are likely to lead to better forms, and thus to significantly lower burdens for SMEs.

In terms of research, there are several potential contributions:

  1. Understanding the needs for consistency and quality assurance of data entry to be able to perform network analysis and machine learning inference on data.
  2. Understanding what tools are needed for domain experts to undertake their own data analysis.
  3. Understanding how to combine visual analytics and machine learning efficiently.

 

The development of methods and tools has the potential to radically improve the quality and precision of company audits undertaken by authorities, and to reduce the SMEs’ expenditure on reporting and auditing, which is estimated at several billion Danish kroner per year.

  • Danish Business Authority
  • Department of Computer Science, Aarhus University
  • DTU Compute
  • Visma
  • Alexandra Institute