Enhance Log-Based Replication to Support Transactional Records #119

Open
sgandhi1311 wants to merge 11 commits into master

@sgandhi1311 sgandhi1311 commented Dec 9, 2024

Description of change

This PR updates the log-based replication logic in the tap-mongodb connector to capture and process records written inside transactions. Previously, the oplog query matched only entries whose ns was the specified collection. Transactional operations, which may modify multiple collections, are logged as applyOps entries under a different namespace and were therefore missed. With this change, the query also matches applyOps entries that contain operations on the specified collection.

Initial oplog query:

oplog_query = {
    'ts': {'$gte': start_ts},
    'ns': 'database_name.collection_name'
}

Updated oplog query:

oplog_query = {
    '$and': [
        {'ts': {'$gte': start_ts}},
        {
            '$or': [
                {'ns': 'database_name.collection_name'},
                {'op': 'c', 'o.applyOps.ns': 'database_name.collection_name'}
            ]
        }
    ]
}
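
As a rough sketch (not the code from this PR), the updated filter could be produced by a small helper. The name build_oplog_query and its parameters are hypothetical; start_ts stands for whatever timestamp the tap resumes from.

```python
def build_oplog_query(start_ts, namespace):
    """Build an oplog filter for one collection, including transactional writes.

    Hypothetical helper for illustration only.
    `namespace` has the form "<database_name>.<collection_name>".
    """
    return {
        '$and': [
            # Only entries at or after the resume point.
            {'ts': {'$gte': start_ts}},
            {
                '$or': [
                    # Ordinary operations logged directly against the collection.
                    {'ns': namespace},
                    # Command entries whose applyOps array contains at least
                    # one operation on the collection (committed transactions).
                    {'op': 'c', 'o.applyOps.ns': namespace},
                ]
            },
        ]
    }
```

Keeping the ts condition outside the $or means both branches share the same resume point, which mirrors the structure of the query above.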

QA steps

  • automated tests passing
  • Verified the changes using both transactional and non-transactional operations:
      • Single-document operations (insert, update, delete) were correctly captured.
      • Multi-document transactional updates and inserts were processed successfully.

Test Scenario

  1. Start a Transaction:
      • Opened a session and began a transaction.
  2. Write Data Within the Transaction:
      • Inserted a record into <database_name.collection_name>.
  3. Write Data Outside the Transaction:
      • Performed an independent insert operation in <database_name.collection_name>.
  4. Write Additional Data Within the Transaction:
      • Updated a record in <database_name.collection_name>.
  5. Commit the Transaction:
      • Closed the transaction by committing it.
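
The scenario above could be reproduced with a script along these lines. This is a minimal sketch, assuming client is a pymongo MongoClient connected to a replica set (transactions require one); the function name and the document contents are made up for illustration and are not taken from the PR.

```python
def run_test_scenario(client, db_name, coll_name):
    """Interleave transactional and non-transactional writes on one collection.

    `client` is assumed to be a pymongo.MongoClient; the documents are
    placeholder data.
    """
    coll = client[db_name][coll_name]
    with client.start_session() as session:
        # Step 1: begin the transaction. Leaving the inner `with` block
        # without an exception commits it (step 5).
        with session.start_transaction():
            # Step 2: insert inside the transaction (note session=...).
            coll.insert_one({'_id': 1, 'source': 'txn'}, session=session)
            # Step 3: independent insert outside the transaction (no session).
            coll.insert_one({'_id': 2, 'source': 'non-txn'})
            # Step 4: update inside the transaction.
            coll.update_one({'_id': 1}, {'$set': {'updated': True}},
                            session=session)
```

Only the calls that pass session=session belong to the transaction; the write in step 3 is an ordinary write even though it runs while the transaction is open.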

Observations

Oplog Behavior:

  • Records related to the transaction were not written to the oplog until the transaction was committed.
  • Once committed, all operations within the transaction appeared as part of a single applyOps entry under the admin.$cmd namespace.

Sequence Maintenance:

  • Transactional records were applied in sequence, maintaining consistency.
  • Non-transactional operations were recorded separately in real-time, outside the transactional context.
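
To illustrate the observations above, here is a sketch of how one oplog entry might be unpacked into the operations that touch a single collection. It is not the PR's implementation: the function name is hypothetical, and copying the parent entry's ts onto each sub-operation is an assumption made for this example.

```python
def extract_collection_ops(oplog_entry, namespace):
    """Return the operations in one oplog entry that target `namespace`.

    Hypothetical helper. Handles a plain entry and an applyOps command entry
    (as written for a committed transaction).
    """
    payload = oplog_entry.get('o', {})
    if oplog_entry.get('op') == 'c' and 'applyOps' in payload:
        ops = []
        # Sub-operations are kept in the order they appear in applyOps.
        for sub_op in payload['applyOps']:
            if sub_op.get('ns') == namespace:
                # Assumption: reuse the parent entry's timestamp.
                ops.append({**sub_op, 'ts': oplog_entry.get('ts')})
        return ops
    if oplog_entry.get('ns') == namespace:
        return [oplog_entry]
    return []


# Example with a made-up applyOps entry spanning two collections.
txn_entry = {
    'ts': 100,
    'op': 'c',
    'ns': 'admin.$cmd',
    'o': {'applyOps': [
        {'op': 'i', 'ns': 'db.orders', 'o': {'_id': 1}},
        {'op': 'i', 'ns': 'db.audit', 'o': {'_id': 9}},
        {'op': 'u', 'ns': 'db.orders', 'o': {'$set': {'paid': True}},
         'o2': {'_id': 1}},
    ]},
}
orders_ops = extract_collection_ops(txn_entry, 'db.orders')
```

In this example, orders_ops holds the insert and the update on db.orders, in that order, and the db.audit operation is filtered out.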

Risks

  • Extraction may take longer than in previous runs, because the oplog query now also matches applyOps entries and transaction-based records require additional processing.

Rollback steps

  • revert this branch

AI generated code

https://internal.qlik.dev/general/ways-of-working/code-reviews/#guidelines-for-ai-generated-code

  • this PR has been written with the help of GitHub Copilot or another generative AI tool

@sgandhi1311 changed the title from "Capture transaction based records for the log based replication" to "Enhance Log-Based Replication to Support Transactional Records" on Dec 9, 2024
@sgandhi1311 force-pushed the TDL-13583-capture-transaction-based-records branch from 27608c5 to 144f0b1 on December 9, 2024 05:45