DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Refcards Trend Reports
Events Video Library
Refcards
Trend Reports

Events

View Events Video Library

Zones

Culture and Methodologies Agile Career Development Methodologies Team Management
Data Engineering AI/ML Big Data Data Databases IoT
Software Design and Architecture Cloud Architecture Containers Integration Microservices Performance Security
Coding Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Culture and Methodologies
Agile Career Development Methodologies Team Management
Data Engineering
AI/ML Big Data Data Databases IoT
Software Design and Architecture
Cloud Architecture Containers Integration Microservices Performance Security
Coding
Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance
Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks

Enterprise AI Trend Report: Gain insights on ethical AI, MLOps, generative AI, large language models, and much more.

2024 Cloud survey: Share your insights on microservices, containers, K8s, CI/CD, and DevOps (+ enter a $750 raffle!) for our Trend Reports.

PostgreSQL: Learn about the open-source RDBMS' advanced capabilities, core components, common commands and functions, and general DBA tasks.

AI Automation Essentials. Check out the latest Refcard on all things AI automation, including model training, data security, and more.

Related

  • How to Store Text in PostgreSQL: Tips, Tricks, and Traps
  • Introduction to Couchbase for Oracle Developers and Experts: Part 2 - Database Objects
  • Keep Calm and Column Wise
  • SQL Commands: A Brief Guide

Trending

  • JUnit, 4, 5, Jupiter, Vintage
  • Debugging Streams With Peek
  • Using My New Raspberry Pi To Run an Existing GitHub Action
  • Continuous Improvement as a Team
  1. DZone
  2. Data Engineering
  3. Data
  4. Datasets Staticity Levels

Datasets Staticity Levels

Choose wisely the correct solution to store data according to its change frequency. We define here three levels that may be implemented from enums to tables.

By 
Bertrand Florat user avatar
Bertrand Florat
·
Jul. 18, 23 · Analysis
Like (3)
Save
Tweet
Share
3.6K Views

Join the DZone community and get the full member experience.

Join For Free

A common challenge when designing applications is determining the most suitable implementation based on the frequency of data changes. Should a status be stored in a table to easily expand the workflow? Should a list of countries be embedded in the code or stored in a table? Should we be able to adjust the thread pool size based on the targeted platform? 

In a current large project, we categorize datasets based on their staticity level, ranging from very static to more volatile: 

Level 1: Very Static Datasets 

These types of data changes always involve business rules and impact the code. A typical example is the list of states in a workflow (STARTED, IN_PROGRESS, WAITING, DONE, etc.). The indicative size of this dataset is usually between 2 to 20 entries. 

From a technical perspective, it is often implemented as an enumeration (a finite list of literal values like Enumerated Types in PostgreSQL, enums in Java, or TypeScript, for instance). Alternatively, it can be managed as constants or a list of constants. 

You can use the following litmus test: "Does any item from this list need to be included in an 'if' statement in the code?". 

Changing this type of data requires a new release and/or a Data Definition Language (DDL) change and is not easily administrable.  

Level 2: Rarely Changing Data 

Think of datasets like a list of countries/states or a list of currencies. These datasets rarely exceed a few tens of entries. We refer to them as "nomenclatures." 

From a technical standpoint, they can be managed using a configuration file (JSON/YAML/CSV/properties, etc.) or within a database (a table if using a relational database like PostgreSQL, a document or a list of documents if using a NoSQL Document database like MongoDB, etc.). 

It is often a good idea to provide an administration GUI that allows adding, changing, or removing entries of this kind if your budget permits.  

These lists are often required to initiate the use of an application, even if the data may change later on. Therefore, it is advisable to package the application with a minimal dataset before its first use. For example, a Liquibase configuration can be released with the application to create a minimal set of countries in the database if it doesn't exist yet. However, be cautious to use an idempotent "CREATE IF NOT EXIST" scheme to avoid conflicting with preexisting data. 

Depending on the packaging and technologies used, a change in this type of data may or may not require a new release. If your application includes a mechanism for embedding a minimal dataset (such as a configuration file or a Liquibase or SQL script executed automatically), it will likely require a new release. While this may initially be seen as a constraint, it ensures that your application is self-contained and always operational from its deployment, which is often worthwhile. 

 When storing nomenclatures in a database, a common strategy is to create a table for each nomenclature (e.g., a table for currencies, a table for countries). If, like us, your application requires a more flexible approach, you can use a single NOMENCLATURE table for each microservice and differentiate the nomenclatures using a simple column (e.g., a NOMENCLATURE name). All nomenclatures are then consolidated in a single technical table, and it is straightforward to retrieve a specific nomenclature using a WHERE clause on the nomenclature name. If you want to maintain an ordering, you can further enhance this approach by assigning an ordinal value to each nomenclature entry.   

Level 3: Volatile Data 

Most applications persist large amounts of data, which we refer to as "volatile data." This type of data can involve an unlimited number of records managed by an application, such as user profiles, addresses, or chat discussions. 

A change, addition, or removal of a record in this kind of dataset should never require a new release (although backups are still necessary). The code is generally designed to handle such changes in a generic manner rather than on a case-by-case basis. 

This type of data is typically not administrable through code changes but is managed through regular front/back-office GUIs or batch programs.   

Summary 

Choosing the appropriate level of staticity is crucial to ensure the maintainability and modifiability of an application and can help avoid potential pitfalls. Using an incorrect solution to handle a particular staticity level can lead to unnecessary integration and release tasks or make the application less maintainable.

           Level 

Change frequency

Indicative size

 

Administrable?

Does change require a new release?

Technical solution examples

1

low

2-20

no

yes

List of constants, Java enum, Enumerated PostgreSQL type

2

medium

10-100

yes

Depends on the chosen solution

Nomenclature table, configuration file

3

high

> 100

no

no

Regular database records

 

Data definition language Document JSON Relational database Data (computing) PostgreSQL

Published at DZone with permission of Bertrand Florat. See the original article here.

Opinions expressed by DZone contributors are their own.

Related

  • How to Store Text in PostgreSQL: Tips, Tricks, and Traps
  • Introduction to Couchbase for Oracle Developers and Experts: Part 2 - Database Objects
  • Keep Calm and Column Wise
  • SQL Commands: A Brief Guide

Partner Resources


Comments

ABOUT US

  • About DZone
  • Send feedback
  • Community research
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Core Program
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 3343 Perimeter Hill Drive
  • Suite 100
  • Nashville, TN 37211
  • support@dzone.com

Let's be friends: