Known issues and limitations v6.0.0
Known issues
These are currently known issues in EDB Postgres Distributed 6. These known issues are tracked in PGD's ticketing system and are expected to be resolved in a future release.
If the resolver for the
update_origin_change
conflict is set toskip
,synchronous_commit=remote_apply
is used, and concurrent updates of the same row are repeatedly applied on two different nodes, then one of the update statements might hang due to a deadlock with the PGD writer. As mentioned in Conflicts,skip
isn't the default resolver for theupdate_origin_change
conflict, and this combination isn't intended to be used in production. It discards one of the two conflicting updates based on the order of arrival on that node, which is likely to cause a divergent cluster. In the rare situation that you do choose to use theskip
conflict resolver, note the issue with the use of theremote_apply
mode.The Decoding Worker feature doesn't work with CAMO/Eager/Group Commit. Installations using CAMO/Eager/Group Commit must keep
enable_wal_decoder
disabled.Lag Control doesn't adjust commit delay in any way on a fully isolated node, that's in case all other nodes are unreachable or not operational. As soon as at least one node connects, replication Lag Control picks up its work and adjusts the PGD commit delay again.
For time-based Lag Control, PGD currently uses the lag time, measured by commit timestamps, rather than the estimated catch up time that's based on historic apply rates.
Changing the CAMO partners in a CAMO pair isn't currently possible. It's possible only to add or remove a pair. Adding or removing a pair doesn't require a restart of Postgres or even a reload of the configuration.
Group Commit can't be combined with CAMO.
Transactions using Eager Replication can't yet execute DDL. The TRUNCATE command is allowed.
Parallel Apply isn't currently supported in combination with Group Commit. Make sure to disable it when using Group Commit by either (a) Setting
num_writers
to 1 for the node group usingbdr.alter_node_group_option
or (b) using the GUCbdr.writers_per_subscription
. See Configuration of generic replication.There currently is no protection against altering or removing a commit scope. Running transactions in a commit scope that's concurrently being altered or removed can lead to the transaction blocking or replication stalling completely due to an error on the downstream node attempting to apply the transaction. Make sure that any transactions using a specific commit scope have finished before altering or removing it.
The PGD CLI can return stale data on the state of the cluster if it's still connecting to nodes that were previously parted from the cluster. Edit the
pgd-cli-config.yml
file, or change your--dsn
settings to ensure only active nodes in the cluster are listed for connection.
To modify a commit scope safely, use bdr.alter_commit_scope
.
DDL run in serializable transactions can face the error:
ERROR: could not serialize access due to read/write dependencies among transactions
. A workaround is to run the DDL outside serializable transactions.The EBD Postgres Advanced Server 17 data type
BFILE
is not currently supported. This is due toBFILE
being a file reference that is stored in the database, and the file itself is stored outside the database and not replicated.EDB Postgres Advanced Server's native autopartioning is not supported in PGD. See Restrictions on EDB Postgres Advanced Server-native automatic partitioning for more information.
Limitations
Take these EDB Postgres Distributed (PGD) design limitations into account when planning your deployment.
Nodes
PGD can run hundreds of nodes, assuming adequate hardware and network. However, for mesh-based deployments, we generally don’t recommend running more than 48 nodes in one cluster. If you need extra read scalability beyond the 48-node limit, you can add subscriber-only nodes without adding connections to the mesh network.
The minimum recommended number of nodes in a group is three to provide fault tolerance for PGD's consensus mechanism. With just two nodes, consensus would fail if one of the nodes were unresponsive. Consensus is required for some PGD operations, such as distributed sequence generation. For more information about the consensus mechanism used by EDB Postgres Distributed, see Architectural details.
Multiple databases on single instances
Support for using PGD for multiple databases on the same Postgres instance is deprecated beginning with PGD 5 and will no longer be supported with PGD 6. As we extend the capabilities of the product, the added complexity introduced operationally and functionally is no longer viable in a multi-database design.
It's best practice and we recommend that you configure only one database per PGD instance.
The deployment automation with TPA and the tooling such as the CLI and Connection Manager already codify that recommendation.
While it's still possible to host up to 10 databases in a single instance, doing so incurs many immediate risks and current limitations:
If PGD configuration changes are needed, you must execute administrative commands for each database. Doing so increases the risk for potential inconsistencies and errors.
You must monitor each database separately, adding overhead.
TPAexec assumes one database. Additional coding is needed by customers or by the EDB Professional Services team in a post-deploy hook to set up replication for more databases.
Connection Manager works at the Postgres instance level, not at the database level, meaning the leader node is the same for all databases.
Each additional database increases the resource requirements on the server. Each one needs its own set of worker processes maintaining replication, for example, logical workers, WAL senders, and WAL receivers. Each one also needs its own set of connections to other instances in the replication cluster. These needs might severely impact performance of all databases.
Synchronous replication methods, for example, CAMO and Group Commit, won’t work as expected. Since the Postgres WAL is shared between the databases, a synchronous commit confirmation can come from any database, not necessarily in the right order of commits.
CLI integration assumes one database.
Durability options (Group Commit/CAMO)
There are various limits on how the PGD durability options work. These limitations are a product of the interactions between Group Commit and CAMO, and how they interact with PGD features such as the WAL decoder and transaction streaming.
Also, there are limitations on interoperability with legacy synchronous replication, interoperability with explicit two-phase commit, and unsupported combinations within commit scope rules.
The following limitations apply to the use of commit scopes and the various durability options they enable.
General durability limitations
Legacy synchronous replication uses a mechanism for transaction confirmation different from the one used by CAMO, Eager, and Group Commit. The two aren't compatible, so don't use them together. Whenever you use Group Commit, CAMO, or Eager, make sure none of the PGD nodes are configured in
synchronous_standby_names
.Postgres two-phase commit (2PC) transactions (that is,
PREPARE TRANSACTION
) can't be used with CAMO, Group Commit, or Eager because those features use two-phase commit underneath.
Group Commit
Group Commit enables configurable synchronous commits over nodes in a group. If you use this feature, take the following limitations into account:
- Not all DDL can run when you use Group Commit. If you use unsupported DDL, a warning is logged, and the transactions commit scope is set to local. The only supported DDL operations are:
- Nonconcurrent
CREATE INDEX
- Nonconcurrent
DROP INDEX
- Nonconcurrent
REINDEX
of an individual table or index CLUSTER
(of a single relation or index only)ANALYZE
TRUNCATE
- Nonconcurrent
Explicit two-phase commit isn't supported by Group Commit as it already uses two-phase commit.
Combining different commit decision options in the same transaction or combining different conflict resolution options in the same transaction isn't supported.
Currently, Raft commit decisions are extremely slow, producing very low TPS. We recommended using them only with the
eager
conflict resolution setting to get the Eager All-Node Replication behavior of PGD 4 and older.
Eager
Eager is available through Group Commit. It avoids conflicts by eagerly aborting transactions that might clash. It's subject to the same limitations as Group Commit.
Eager doesn't allow the NOTIFY
SQL command or the pg_notify()
function. It
also doesn't allow LISTEN
or UNLISTEN
.
CAMO
Commit At Most Once (CAMO) is a feature that aims to prevent applications committing more than once. If you use this feature, take these limitations into account when planning:
CAMO is designed to query the results of a recently failed COMMIT on the origin node. In case of disconnection, the application must request the transaction status from the CAMO partner. Ensure that you have as little delay as possible after the failure before requesting the status. Applications must not rely on CAMO decisions being stored for longer than 15 minutes.
If the application forgets the global identifier assigned, for example, as a result of a restart, there's no easy way to recover it. Therefore, we recommend that applications wait for outstanding transactions to end before shutting down.
For the client to apply proper checks, a transaction protected by CAMO can't be a single statement with implicit transaction control. You also can't use CAMO with a transaction-controlling procedure or in a
DO
block that tries to start or end transactions.CAMO resolves commit status but doesn't resolve pending notifications on commit. CAMO doesn't allow the
NOTIFY
SQL command or thepg_notify()
function. They also don't allowLISTEN
orUNLISTEN
.When replaying changes, CAMO transactions might detect conflicts just the same as other transactions. If timestamp-conflict detection is used, the CAMO transaction uses the timestamp of the prepare-on-the-origin node, which is before the transaction becomes visible on the origin node itself.
CAMO isn't currently compatible with transaction streaming. Be sure to disable transaction streaming when planning to use CAMO. You can configure this option globally or in the PGD node group. See Transaction streaming configuration.
CAMO isn't currently compatible with decoding worker. Be sure to not enable decoding worker when planning to use CAMO. You can configure this option in the PGD node group. See Decoding worker disabling.
Not all DDL can run when you use CAMO. If you use unsupported DDL, a warning is logged and the transactions commit scope is set to local only. The only supported DDL operations are:
- Nonconcurrent
CREATE INDEX
- Nonconcurrent
DROP INDEX
- Nonconcurrent
REINDEX
of an individual table or index CLUSTER
(of a single relation or index only)ANALYZE
TRUNCATE
- Nonconcurrent
Explicit two-phase commit isn't supported by CAMO as it already uses two-phase commit.
You can combine only CAMO transactions with the
DEGRADE TO
clause for switching to asynchronous operation in case of lowered availability.
Mixed PGD versions
PGD was developed to enable rolling upgrades of PGD by allowing mixed versions of PGD to operate during the upgrade process. We expect users to run mixed versions only during upgrades and, once an upgrade starts, that they complete that upgrade. We don't support running mixed versions of PGD except during an upgrade.
Other limitations
This noncomprehensive list includes other limitations that are expected and are by design. We don't expect to resolve them in the future. Consider these limitations when planning your deployment:
- A
galloc
sequence might skip some chunks if you create the sequence in a rolled back transaction and then create it again with the same name. Skipping chunks can also occur if you create and drop the sequence when DDL replication isn't active and then you create it again when DDL replication is active. The impact of the problem is mild because the sequence guarantees aren't violated. The sequence skips only some initial chunks. Also, as a workaround, you can specify the starting value for the sequence as an argument to thebdr.alter_sequence_set_kind()
function.
- On this page
- Known issues
- Limitations
- CAMO