SQL.What are domain types (or constraints on data types), and how do they help enforce data correctness?

This is a schema design / data integrity question. Interviewers ask it to see whether you push correctness into the database, not only into application code.


Short, interview-ready answer

Domain types and constraints restrict what values a column can hold, encoding business rules directly in the schema so invalid data cannot be stored, regardless of the application.


1️⃣ What are “domain types” (conceptually)

A domain is a restricted version of a base data type.

Think of it as:

“Not just an INT — an INT with meaning and rules.”

Instead of repeating rules everywhere, you define them once and reuse them.


2️⃣ Constraints vs domain types

Constraints on a column

Rules are applied per column, inside each table definition, and must be repeated wherever the rule applies.

Domain type (PostgreSQL example)

Rules are attached once to a reusable type; columns then simply use that type.

Same rule, centralized.
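A minimal sketch of both approaches (PostgreSQL syntax; the table and column names are illustrative):

```sql
-- Constraint applied per column: repeated in every table that needs it
CREATE TABLE users (
    age INT CHECK (age >= 18)
);

-- Domain: the rule is defined once as a reusable type
CREATE DOMAIN adult_age AS INT
    CHECK (VALUE >= 18);

-- Then used like an ordinary type
CREATE TABLE customers (
    age adult_age
);
```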


3️⃣ Common constraint types (must know)

NOT NULL

Prevents missing mandatory data.

CHECK

Encodes business rules.

UNIQUE

Prevents duplicates.


FOREIGN KEY

Enforces referential integrity.


ENUM / domain-like restrictions
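A single hedged sketch covering each of these constraint kinds (PostgreSQL; all names are illustrative):

```sql
CREATE TYPE order_status AS ENUM ('pending', 'paid', 'shipped');

CREATE TABLE orders (
    id      BIGINT PRIMARY KEY,
    user_id BIGINT NOT NULL REFERENCES users(id),  -- NOT NULL + FOREIGN KEY
    number  TEXT   NOT NULL UNIQUE,                -- UNIQUE: no duplicates
    amount  BIGINT NOT NULL CHECK (amount >= 0),   -- CHECK: business rule
    status  order_status NOT NULL                  -- ENUM-like restriction
);
```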

4️⃣ Why domain types are powerful (senior insight)

❌ Without domains

Rules are duplicated:

  • in services
  • in validations
  • in migrations
  • inconsistently enforced

Eventually:

  • one path forgets validation
  • bad data leaks in
  • bugs appear years later

✅ With domains / constraints

  • Invalid data cannot be stored
  • All writers are protected:
    • app
    • scripts
    • migrations
    • admin tools

The database becomes the last line of defense.

5️⃣ Real-world examples (very interview-friendly)

Example 1: Email
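One possible sketch (PostgreSQL; the regex is deliberately simplistic):

```sql
CREATE DOMAIN email AS TEXT
    CHECK (VALUE ~ '^[^@]+@[^@]+$');
```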

(Not perfect, but enforces basic sanity.)

Example 2: Money
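A sketch using minor units (the domain and column names are illustrative):

```sql
CREATE DOMAIN money_cents AS BIGINT
    CHECK (VALUE >= 0);

-- usage:
-- balance money_cents NOT NULL DEFAULT 0
```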

Stores cents, guarantees non-negative.


Example 3: Status
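A sketch using a CHECK-constrained column (an ENUM type works too; the status values are illustrative):

```sql
CREATE TABLE orders (
    status TEXT NOT NULL
        CHECK (status IN ('pending', 'paid', 'shipped'))
);
```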

No accidental "new" or "PAYED".


6️⃣ Why this matters for backend engineers

🔴 Application-only validation is not enough

  • Multiple services
  • Background jobs
  • Manual fixes
  • Future code changes

Sooner or later:

Something bypasses validation.


✅ Constraints give you:

  • Data correctness by construction
  • Simpler application code
  • Safer refactoring
  • Easier debugging

7️⃣ Trade-offs (be honest in interviews)

⚠️ Constraints can:

  • Break bad legacy data
  • Require careful migrations
  • Slow writes slightly (negligible vs correctness)

Senior answer:

Correctness > convenience.


8️⃣ Interview-ready final answer (clean)

“Domain types and constraints restrict allowed values for columns, encoding business rules directly into the database schema.
They prevent invalid data from being stored regardless of how the database is accessed, improving correctness, consistency, and long-term maintainability.
They move validation from application code into the data model itself.”


🚩 Red flags interviewers notice

If a candidate says:

  • “We validate everything in code”
  • “Constraints slow the database”
  • “Foreign keys are optional”

👉 That’s not senior-level thinking.


SQL.How does SQL handle comparisons between different data types (for example, string vs number), and why is this dangerous?

This is a semantic + performance trap question. Interviewers ask it to see whether you understand implicit casting, three-valued logic, and index usage.


Short, interview-ready answer

SQL implicitly casts values to make comparisons possible, but this can change semantics and disable indexes, leading to wrong results and full table scans.
That’s why comparing different data types is dangerous and should be avoided.


1️⃣ What SQL actually does

When you compare different data types, SQL tries to coerce one side to the other:

WHERE user_id = '42'

If user_id is INTEGER, the engine rewrites it (conceptually) as:

WHERE user_id = CAST('42' AS INTEGER)

This is implicit type casting.

Which side gets cast depends on:

  • DB engine (Postgres, MySQL, Oracle)
  • Type precedence rules
  • Context (WHERE, JOIN, expression)

2️⃣ Why this is dangerous — correctness bugs

🔴 Example 1: string vs number

WHERE amount > '100'

Possible outcomes:

  • Numeric comparison (expected)
  • Lexical comparison (in some engines / contexts)

Lexical comparison:

'20' > '100'  -- TRUE (string comparison!)

That’s silent data corruption.

🔴 Example 2: invalid values

WHERE user_id = 'abc'

  • Some DBs throw an error
  • Others silently fail or filter everything
  • Behavior may change after upgrades

3️⃣ Why this is dangerous — performance bugs 🚨

Index-killer pattern

CREATE INDEX idx_orders_user_id ON orders(user_id);

SELECT *
FROM orders
WHERE CAST(user_id AS TEXT) = '42';

Plan:

Seq Scan on orders

❌ Index cannot be used because the column is wrapped in a function.

Golden rule (must remember)

Indexes work only when the column appears “as-is” in the predicate.

Safe:

WHERE user_id = 42

Unsafe:

WHERE CAST(user_id AS TEXT) = '42'

4️⃣ JOINs with mismatched types (very common)

JOIN users u ON o.user_id = u.external_id

If:

  • user_id = INTEGER
  • external_id = VARCHAR

Then:

  • One side is cast
  • Index on that side becomes unusable
  • Join degenerates to hash join or full scan

This turns OLTP joins into OLAP-style work.
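If the schema can be changed, aligning the types removes the cast entirely (PostgreSQL; this assumes every existing external_id value is numeric):

```sql
ALTER TABLE users
    ALTER COLUMN external_id TYPE BIGINT
    USING external_id::BIGINT;
```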

5️⃣ NULL + implicit casting = chaos

WHERE user_id = NULL
  • Comparison evaluates to UNKNOWN
  • Rows are silently filtered out

Combined with casts, this becomes extremely hard to debug.

6️⃣ How senior engineers avoid this

✅ Use consistent schema types

  • PK and FK must have identical types
  • No INT ↔ VARCHAR joins

✅ Cast constants, not columns

WHERE user_id = CAST(? AS INTEGER)

❌ Never:

WHERE CAST(user_id AS TEXT) = ?

✅ Use typed literals

DATE '2026-01-28'
TIMESTAMP '2026-01-28 10:00:00'

✅ Let SQL fail loudly

Errors > silent coercion.


7️⃣ Interview-ready final answer (clean)

“SQL allows comparisons between different data types by applying implicit casts, but this is dangerous because it can change comparison semantics and prevent index usage.
When a cast is applied to a column, indexes often become unusable, leading to full scans and poor performance.
To avoid this, schemas and queries must use consistent data types and apply casts only to constants, not columns.”


SQL.What is the difference between DATE, TIME, TIMESTAMP, and TIMESTAMPTZ, and which one should be used for business events?

Short, interview-ready answer

DATE stores a calendar day, TIME stores a time of day, TIMESTAMP stores a date and time without timezone, and TIMESTAMPTZ stores an absolute moment in time.
For business events, TIMESTAMPTZ should almost always be used.


1️⃣ DATE

What it stores

  • Calendar date only
  • No time, no timezone
DATE '2026-01-28'

Use cases

  • Birthdays
  • Due dates
  • Business days
  • Holidays

Not for

❌ Events with ordering or exact time

2️⃣ TIME

What it stores

  • Time of day
  • No date, no timezone
TIME '14:30:00'

Use cases

  • Store opening hours
  • Schedules (“opens at 9:00”)

Not for

❌ Events
❌ Logging
❌ Auditing

3️⃣ TIMESTAMP (without timezone)

What it stores

  • Date + time
  • No timezone information
  • Interpreted in session timezone
TIMESTAMP '2026-01-28 14:30:00'

The trap

  • Ambiguous across timezones
  • DST changes break assumptions

Example:

"2026-03-29 02:30"  -- may not exist in some timezones

Use cases

  • Local wall-clock time
  • Legacy systems
  • Non-global apps (rare today)

4️⃣ TIMESTAMPTZ (timestamp with time zone)

What it really is (important!)

An absolute point in time stored internally in UTC.

  • Timezone normalized on write
  • Converted on read
TIMESTAMPTZ '2026-01-28 14:30:00+03'

Stored as:

2026-01-28 11:30:00 UTC

Use cases

✅ Business events
✅ Logs
✅ Audits
✅ Payments
✅ Distributed systems

5️⃣ Comparison table

Type        | Date | Time | Timezone | Meaning
DATE        | ✅   | ❌   | ❌       | Calendar day
TIME        | ❌   | ✅   | ❌       | Time of day
TIMESTAMP   | ✅   | ✅   | ❌       | Local datetime
TIMESTAMPTZ | ✅   | ✅   | ✅       | Absolute instant

6️⃣ Which one for business events?

✅ Correct answer

Use TIMESTAMPTZ.

Why:

  • Business events happen at a real moment
  • Systems are distributed
  • Users are in different timezones
  • DST exists

Example (correct modeling)

created_at TIMESTAMPTZ NOT NULL DEFAULT now()

What about reporting by day?

Still use TIMESTAMPTZ, then derive:

DATE(created_at AT TIME ZONE 'Europe/Berlin')

Never store the derived value as truth.

7️⃣ Interview-ready final answer (clean)

“DATE and TIME store partial temporal information, TIMESTAMP stores a local date-time without timezone, and TIMESTAMPTZ represents an absolute instant normalized to UTC.
For business events and distributed systems, TIMESTAMPTZ is the correct choice because it avoids ambiguity and timezone-related bugs.”


Senior red flags 🚩

If someone says:

  • “TIMESTAMP is fine everywhere”
  • “We store local time and convert later”
  • “Timezone is a frontend problem”

👉 That’s not senior-level thinking.

SQL.How do numeric types like INTEGER, BIGINT, DECIMAL, and FLOAT differ in terms of precision and use cases?

Short, interview-ready answer

INTEGER and BIGINT are exact whole numbers, DECIMAL is exact for fractional values, and FLOAT is approximate and should never be used for money.
The choice affects correctness, not just storage.

1️⃣ INTEGER / BIGINT — exact whole numbers

What they are

  • Fixed-precision
  • Store exact values
  • No rounding
INTEGER   -- usually 32-bit
BIGINT    -- usually 64-bit

Typical ranges

  • INTEGER: ~ ±2 billion
  • BIGINT: ~ ±9 quintillion

Use cases

✅ IDs, counters, quantities
✅ Status codes
✅ Amounts in minor units (cents)

Example

balance_cents BIGINT

Interview insight

If money has no fractions → integers are perfect.

2️⃣ DECIMAL(p, s) / NUMERIC — exact decimals

What it is

  • Arbitrary precision
  • Exact representation of decimals
  • Slower, but correct
DECIMAL(10, 2)

p = total digits

s = digits after decimal point

Use cases

✅ Money
✅ Financial calculations
✅ Rates, percentages, measurements where correctness matters

Example

price DECIMAL(10,2)

Why it matters

0.1 + 0.2 = 0.3  -- TRUE with DECIMAL

3️⃣ FLOAT / REAL / DOUBLE — approximate numbers 🚨

What they are

  • Floating-point (IEEE-754)
  • Approximate
  • Fast, hardware-supported
REAL
DOUBLE PRECISION

The classic trap

0.1 + 0.2 = 0.30000000000000004

Use cases

✅ Scientific data
✅ Metrics
✅ Sensor data
✅ ML / statistics
❌ Money ❌

Interview rule

Never use FLOAT for financial data.
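Both behaviors can be checked directly in PostgreSQL:

```sql
SELECT 0.1::numeric + 0.2::numeric;  -- 0.3 (exact)
SELECT 0.1::float8  + 0.2::float8;   -- 0.30000000000000004 (approximate)
```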

4️⃣ Precision comparison (must remember)

Type    | Exact? | Fractional? | Typical use
INTEGER | ✅     | ❌          | IDs, counters
BIGINT  | ✅     | ❌          | Large counters, money in cents
DECIMAL | ✅     | ✅          | Money, finance
FLOAT   | ❌     | ✅          | Scientific, analytics

5️⃣ Performance vs correctness trade-off

  • INTEGER / BIGINT → fastest, exact
  • DECIMAL → slower, exact
  • FLOAT → fastest for math, inexact

Senior mindset:

Correctness first, performance second.


6️⃣ Common real-world mistakes 🚩

❌ Using FLOAT for prices
❌ Mixing DECIMAL and FLOAT in calculations
❌ Using DECIMAL without defining scale
❌ Storing money as DOUBLE “because Java double”


7️⃣ PostgreSQL-specific note (bonus)

  • NUMERIC == DECIMAL
  • Unlimited precision (bounded by memory)
  • Safe for finance, but avoid unnecessary precision

Interview-ready final answer (clean)

“INTEGER and BIGINT store exact whole numbers and are ideal for IDs and counters.
DECIMAL stores exact fractional values and is the correct choice for money and financial data.
FLOAT uses approximate representation and should be used only where small rounding errors are acceptable, such as scientific or statistical data.”


SQL.What problems can arise from implicit type casting in SQL, and how can it affect indexes?

Short, interview-ready answer

Implicit type casting can change query semantics, hide bugs, and—most importantly—prevent index usage by forcing the database to apply functions to indexed columns, leading to full scans instead of index scans.

1️⃣ What is implicit type casting?

Implicit casting happens when SQL automatically converts one type to another so a comparison can be made.

-- column is INTEGER
WHERE user_id = '42'   -- string literal

The DB silently inserts a cast.

2️⃣ The biggest problem: indexes stop working 🚨

Example (classic)

CREATE INDEX idx_orders_user_id ON orders(user_id);

SELECT *
FROM orders
WHERE user_id = '42';

What really happens (simplified):

WHERE user_id = CAST('42' AS INTEGER)

✅ Index can still be used in some DBs.

But this one is deadly:

SELECT *
FROM orders
WHERE CAST(user_id AS TEXT) = '42';

Execution plan:

Seq Scan on orders

❌ Index is not usable because the column is wrapped in a function.

Rule (must remember)

If the indexed column is wrapped in a function or cast, the index cannot be used.

3️⃣ Real-world examples that kill performance

❌ Comparing DATE to TIMESTAMP

WHERE created_at = '2024-01-01'

Becomes:

WHERE created_at = TIMESTAMP '2024-01-01 00:00:00'

Almost never matches what you expect.

Better:

WHERE created_at >= DATE '2024-01-01'
  AND created_at <  DATE '2024-01-02'

❌ Numeric vs string comparison

WHERE amount = '100'

  • It works
  • But it hides schema bugs
  • It can break if locale or format changes

❌ JOINs with mismatched types (very common)

JOIN users u ON o.user_id = u.external_id

Where:

  • user_id = INT
  • external_id = VARCHAR

Result:

  • Cast on one side
  • Index on one column ignored
  • Join becomes much more expensive

4️⃣ Implicit casts can change results (not just performance)

Example: string → number

WHERE price > '100'

Depending on DB:

  • numeric comparison
  • or lexical comparison ('100' < '20'!)

This is data corruption waiting to happen.

5️⃣ How this affects JOIN algorithms

Implicit casting can:

  • disable index nested loop joins
  • force hash joins or full scans
  • increase memory usage
  • change join order

So a tiny type mismatch can:

turn a millisecond OLTP query into a seconds-long query.


6️⃣ How to avoid implicit casting (best practices)

✅ Always compare same types

WHERE user_id = 42

✅ Use typed literals

DATE '2024-01-01'
TIMESTAMP '2024-01-01 10:00:00'

✅ Fix schema mismatches

  • FK and PK must have identical types
  • No INT ↔ VARCHAR joins

✅ Cast constants, not columns

WHERE user_id = CAST(? AS INTEGER)

❌ Never:

WHERE CAST(user_id AS TEXT) = ?

7️⃣ Interview-ready final answer (clean)

“Implicit type casting can hide bugs and cause serious performance issues by preventing index usage.
When a cast or function is applied to an indexed column, the optimizer can’t use the index, often resulting in full scans or suboptimal join algorithms.
To avoid this, schemas and comparisons must use consistent data types and casts should be applied to constants, not columns.”

Senior red flags 🚩

If someone says:

  • “Postgres handles casts automatically”
  • “Indexes still work with casts”
  • “It doesn’t matter if types differ”

👉 That’s mid-level thinking.


SQL.How does NULL differ from 0, an empty string, or FALSE in SQL?

Short, interview-ready answer

NULL means “unknown or not applicable”, while 0, empty string (''), and FALSE are known, concrete values.
NULL propagates through expressions and comparisons, whereas the others behave like normal values.


Core semantic difference

NULL

  • Represents absence of a value
  • Not equal to anything — not even another NULL
  • Causes expressions to evaluate to UNKNOWN

0

  • A numeric value
  • Fully comparable and calculable

'' (empty string)

  • A string of length 0
  • Still a valid value

FALSE

  • A boolean value
  • Known and comparable

Comparison behavior (this is the trap)

NULL = NULL        → UNKNOWN
NULL <> NULL       → UNKNOWN
NULL = 0           → UNKNOWN
NULL = ''          → UNKNOWN

but

0 = 0              → TRUE
'' = ''            → TRUE
FALSE = FALSE      → TRUE

Consequence

WHERE column = NULL

Correct

WHERE column IS NULL

Logical expressions (three-valued logic)

TRUE  AND NULL  → UNKNOWN
FALSE AND NULL  → FALSE
TRUE  OR  NULL  → TRUE
FALSE OR  NULL  → UNKNOWN
NOT NULL        → UNKNOWN

WHERE keeps only TRUE, so UNKNOWN rows are filtered out.

Aggregations and NULLs

COUNT(column)   -- ignores NULL
COUNT(*)        -- counts rows
SUM(column)     -- ignores NULL

But:

AVG(NULL) → NULL

This surprises many candidates.

NULL in constraints

  • UNIQUE allows multiple NULLs (in most DBs)
  • NOT NULL enforces presence
  • CHECK (col > 0) → accepts NULL, because the condition evaluates to UNKNOWN, and a CHECK is violated only when it is FALSE
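A quick sketch of how NULL interacts with CHECK (standard SQL: a CHECK constraint is violated only when it evaluates to FALSE, so UNKNOWN passes):

```sql
CREATE TABLE t (col INT CHECK (col > 0));

INSERT INTO t VALUES (NULL);  -- accepted: CHECK evaluates to UNKNOWN
INSERT INTO t VALUES (-5);    -- rejected: CHECK evaluates to FALSE
```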

Why confusing NULL with 0 / '' / FALSE is dangerous

Bad design example

balance INT DEFAULT 0

Does 0 mean:

  • truly zero?
  • unknown?
  • not loaded yet?

Correct modeling:

  • Use NULL for “unknown”
  • Use real values for real meaning

Interview-ready final answer (clean)

“NULL represents the absence of a value, not a value itself, while 0, empty string, and FALSE are concrete values.
Comparisons with NULL evaluate to UNKNOWN due to SQL’s three-valued logic, which is why IS NULL must be used instead of =.
Misusing NULL leads to subtle filtering and aggregation bugs.”


Senior red flags 🚩

If a candidate says:

  • “NULL is just another value”
  • “NULL equals NULL”
  • “I use 0 instead of NULL”

👉 That’s not senior-level thinking.

SQL.What is the difference between CHAR, VARCHAR, and TEXT, and when would you choose each?

Short answer (what to say first)

CHAR is fixed-length and space-padded, VARCHAR is variable-length with a limit, and TEXT is variable-length without a defined limit.
In practice, VARCHAR is the default choice, CHAR is for truly fixed-size values, and TEXT is for large or unbounded content.

1️⃣ CHAR(n)

What it is

  • Fixed-length string
  • Always uses n characters
  • Right-padded with spaces
CHAR(10)

Example

'abc' → 'abc       '

When to use

  • Truly fixed-size data:
    • ISO country codes (CHAR(2))
    • Fixed flags
    • Legacy schemas

Pros / Cons

✅ Predictable size
❌ Wastes space
❌ Easy to misuse

Interview note

CHAR rarely improves performance in modern DBs.

2️⃣ VARCHAR(n)

What it is

  • Variable-length string
  • Enforced maximum length
  • No padding
VARCHAR(255)

Example

'abc' → 'abc'

When to use (most cases)

  • Names
  • Emails
  • Codes with max length
  • API input validation at DB level

Pros / Cons

✅ No wasted space
✅ Length constraint protects data
❌ Slight length overhead (negligible)

Interview note

Default choice for application data.

3️⃣ TEXT

What it is

  • Variable-length
  • No declared limit (technically very large)
  • Stored like other varlena types in Postgres
TEXT

When to use

  • Long descriptions
  • JSON blobs (sometimes)
  • Logs
  • Comments / messages

Pros / Cons

✅ No artificial limit
❌ Harder to reason about size
❌ Sometimes discouraged in strict schemas

Interview note

TEXT and VARCHAR have similar performance in PostgreSQL.
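A schema sketch applying the guidance above (table and column names are illustrative):

```sql
CREATE TABLE products (
    country_code CHAR(2)      NOT NULL,  -- truly fixed-size value
    name         VARCHAR(200) NOT NULL,  -- bounded by a business rule
    description  TEXT                    -- unbounded free text
);
```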

4️⃣ Comparison table

Type       | Fixed | Max length | Padding | Typical use
CHAR(n)    | Yes   | Yes        | Yes     | Country codes, legacy
VARCHAR(n) | No    | Yes        | No      | Most app fields
TEXT       | No    | No         | No      | Large free text

5️⃣ Performance & indexing (important)

  • Postgres treats TEXT and VARCHAR almost identically
  • Index performance is the same
  • Length limit is a semantic constraint, not a performance one

Bad myth:

VARCHAR(255) is faster than TEXT

❌ False in Postgres.


6️⃣ When NOT to use each

❌ Don’t use CHAR(n) for names or emails
❌ Don’t use TEXT when business rules require a max length
❌ Don’t pick VARCHAR(255) blindly “because everyone does”


7️⃣ Interview-ready final answer (2–3 sentences)

“CHAR is fixed-length and space-padded, suitable only for truly fixed-size values.
VARCHAR is variable-length with a maximum and is the default choice for most application fields.
TEXT has no declared limit and is used for large or unbounded content; in PostgreSQL it performs similarly to VARCHAR.”


Senior bonus insight ⭐

In well-designed schemas:

  • Length limits express business constraints
  • Not performance tuning

SQL.What is MVCC ?

MVCC is a concurrency control mechanism where the database keeps multiple versions of a row, allowing readers and writers to proceed without blocking each other.


Why MVCC exists (one line)

To avoid read locks blocking writes (and vice versa) while preserving transactional isolation.


How MVCC works (PostgreSQL intuition)

  • UPDATE / DELETE does not overwrite a row
  • A new version is created
  • Old versions remain visible to older transactions
  • Visibility is decided by a transaction snapshot

So:

  • Reader sees a consistent snapshot
  • Writer creates a new row version

No read locks.

Simple example

T1: SELECT balance FROM accounts;   → sees version V1
T2: UPDATE accounts SET balance=90; → creates version V2
T1: still sees V1
T3: sees V2
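The same scenario as two concurrent sessions (PostgreSQL; REPEATABLE READ keeps T1's snapshot stable, and the table is illustrative):

```sql
-- Session 1 (T1)
BEGIN ISOLATION LEVEL REPEATABLE READ;
SELECT balance FROM accounts WHERE id = 1;      -- sees 100 (version V1)

-- Session 2 (T2)
UPDATE accounts SET balance = 90 WHERE id = 1;  -- writes version V2, reader not blocked

-- Session 1 (T1), still inside its transaction
SELECT balance FROM accounts WHERE id = 1;      -- still sees 100
COMMIT;

-- A new transaction (T3) now sees 90
SELECT balance FROM accounts WHERE id = 1;
```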

What MVCC gives you

✅ Readers don’t block writers
✅ Writers don’t block readers
✅ High concurrency (OLTP-friendly)
✅ Consistent reads inside a transaction


The cost of MVCC (important!)

❌ Dead tuples accumulate
❌ Requires VACUUM / AutoVacuum
❌ Long-running transactions prevent cleanup
❌ Storage bloat if mismanaged

This is why autovacuum is critical.


MVCC & isolation levels (Postgres)

  • READ COMMITTED → snapshot per statement
  • REPEATABLE READ → snapshot per transaction
  • SERIALIZABLE → MVCC + conflict detection

Interview-ready one-liner

MVCC allows multiple versions of rows so readers and writers don’t block each other, using transaction snapshots to decide which version is visible.


SQL.What is AutoVacuum Postgres ?

Autovacuum is PostgreSQL’s background process that cleans up dead tuples created by MVCC and refreshes statistics, so tables don’t bloat and the optimizer stays accurate.

Why it exists (one line)

Postgres uses MVCC, so updates/deletes create dead rows; autovacuum removes them and updates stats automatically.

What AutoVacuum does

  • VACUUM: removes dead tuples, frees space for reuse
  • ANALYZE: updates table/column statistics for the optimizer
  • Prevents transaction ID wraparound (critical safety task)

When it runs

Autovacuum triggers per table when changes exceed thresholds:

  • autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * table_size
  • autovacuum_analyze_threshold + autovacuum_analyze_scale_factor * table_size

(Defaults work for many cases; hot tables often need tuning.)
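For a hot table, the per-table thresholds can be lowered with real PostgreSQL storage parameters (the values below are illustrative):

```sql
ALTER TABLE orders SET (
    autovacuum_vacuum_scale_factor  = 0.01,  -- vacuum after ~1% of rows change
    autovacuum_analyze_scale_factor = 0.02   -- analyze after ~2% of rows change
);
```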


Why backend engineers care

  • Disabled/misconfigured autovacuum ⇒ table bloat, slow queries
  • Stale stats ⇒ bad plans (wrong joins/scans)
  • OLTP systems rely on it for steady performance

Interview-ready sentence

Autovacuum is PostgreSQL’s background worker that reclaims space from dead rows and keeps optimizer statistics fresh to maintain performance and prevent bloat.


SQL.What is eventual consistency ?

Eventual consistency means that if no new updates are made, all replicas will eventually converge to the same state, but reads may temporarily see stale data.

Key phrase to remember:

Consistency is delayed, not lost.


Why eventual consistency exists

Strong consistency across nodes requires:

  • distributed locks
  • synchronous replication
  • blocking on network failures

That kills:

  • latency
  • availability
  • scalability

Eventual consistency trades immediate correctness for:

  • high availability
  • low latency
  • fault tolerance

Concrete example (classic)

Bank balance replicated to two regions

Client → Region A → balance = 100 → 90 (withdraw)
                 ↘
                  Region B still sees balance = 100

For a short time:

  • Region A: 90
  • Region B: 100

Later:

  • replication catches up
  • both show 90

That window = eventual consistency gap

Where eventual consistency is used

  • Distributed databases (Cassandra, DynamoDB)
  • Caches (Redis replicas)
  • Search indexes
  • OLAP replicas
  • Microservices with async messaging
  • CDC-based pipelines

Eventual vs Strong consistency

Aspect              | Strong Consistency | Eventual Consistency
Read after write    | Always latest      | Might be stale
Availability        | Lower              | Higher
Latency             | Higher             | Lower
Partition tolerance | Limited            | High
Complexity          | Lower for app      | Higher for app

Eventual consistency & CAP theorem

CAP says:

  • Consistency
  • Availability
  • Partition tolerance

In distributed systems:

You must choose between C and A when a partition happens.

Eventual consistency is typically:

  • AP system

What “eventual” really implies (important!)

It does NOT mean:

  • random
  • inconsistent forever
  • unreliable

It DOES mean:

  • ordering may differ temporarily
  • duplicates may appear
  • reads may lag behind writes

Backend implications (very important)

1️⃣ You must design for it

Assume:

  • duplicate events
  • out-of-order messages
  • delayed visibility

Therefore:

  • idempotent writes
  • versioning / timestamps
  • retry-safe operations
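For example, duplicate delivery of the same event can be made harmless with an idempotent insert (PostgreSQL; the table and columns are illustrative, and event_id is assumed UNIQUE):

```sql
-- Processing the same event twice leaves exactly one row
INSERT INTO payments (event_id, amount_cents)
VALUES ('evt-123', 500)
ON CONFLICT (event_id) DO NOTHING;
```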

2️⃣ User-facing semantics matter

Examples:

  • “Your balance may update in a few seconds”
  • “Report data is delayed by up to 5 minutes”

Clear contracts > false guarantees.


3️⃣ Common consistency patterns

  • Read-your-writes (session consistency)
  • Monotonic reads
  • Causal consistency
  • Compensating actions

Eventual consistency has levels, not just one mode.


Eventual consistency in Kafka / CDC pipelines

A typical event pipeline:

  • Kafka → OLTP → CDC → OLAP

This is eventually consistent:

  • Transaction visible in OLTP immediately
  • Appears in OLAP after some delay

That’s fine for:

  • reports
  • dashboards
  • analytics

But NOT fine for:

  • real-time balance checks
  • authorization decisions

Interview-ready short answer

“Eventual consistency means that replicas may temporarily diverge, but if no new updates occur, they will converge to the same state. It’s a trade-off that improves availability and scalability at the cost of immediate consistency, and it requires applications to handle stale reads and retries.”
