Eventual vs Strong Consistency

The choice between eventual and strong consistency in a JanusGraph deployment is not a philosophical stance — it is a per-write configuration decision that determines whether a traversal can see its own commit. It exists because JanusGraph decouples transactional storage (Cassandra, ScyllaDB, HBase) from mixed-index engines (Elasticsearch, OpenSearch), and that split opens a replication window in which index state trails committed vertex and edge mutations. This page sits under External Index Synchronization & Consistency Tuning and narrows the parent’s tuning surface to a single question: where do you place the acknowledgment boundary, and what does each placement cost in latency, throughput, and failure blast radius. Get it wrong globally and you either serialize every write behind an index refresh or you serve stale search results to a fraud-detection query; get it right selectively and you pay the strong-consistency tax only on the writes that genuinely need it.

[Sequence diagram: eventual vs strong consistency.] Eventual acknowledges after the storage commit and indexes asynchronously; strong holds the acknowledgment until the index confirms. The messages are the same; only the acknowledgment boundary moves.

Strong consistency buys immediate visibility and pays for it in write latency, throughput, and failure blast radius, so make it the exception, not the global default.

The Dual-Write Path and Where the Boundary Lives

JanusGraph executes a two-step write, which is not a two-phase commit: nothing rolls the storage commit back if the index step fails. The primary transaction commits synchronously to the storage backend at the configured consistency level; index mutations are then serialized and dispatched to the search cluster through the IndexProvider interface. The storage ack is the point of no return: once Cassandra or ScyllaDB confirms at QUORUM, the mutation is authoritative regardless of index state. The acknowledgment boundary is simply the question of when JanusGraph returns control to your client: at the storage ack (eventual) or after the index shard is searchable (strong).

Two models fall out of that single decision:

Eventual consistency: the default. Index updates propagate asynchronously via background workers; a traversal routed through the mixed index may return stale or missing results until the next refresh cycle. This maximizes ingestion throughput and minimizes commit latency, and it is the correct default for bulk pipelines, analytical scans, and any workload that does not read its own writes within the replication window.
Strong consistency: opt-in for the writes that need it. The commit call blocks until the index backend has made the document mutation searchable (bulk-refresh=wait_for). It guarantees immediate visibility but increases write latency, reduces throughput under load, and amplifies failure cascades when an index node is slow or partitioned, because the return of the commit call is now hostage to index health even though storage has already acknowledged.
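A toy in-memory model can make the two placements concrete. The sketch below is not JanusGraph code: `ToyGraph`, its `storage`, `index`, and `pending` fields, and the `refresh()` method are hypothetical stand-ins for the storage backend, the mixed index, and an index refresh.

```python
class ToyGraph:
    """In-memory stand-in: 'storage' is authoritative, 'index' trails it."""

    def __init__(self):
        self.storage = {}   # committed vertices (authoritative)
        self.index = {}     # documents that are already searchable
        self.pending = []   # index mutations dispatched but not yet searchable

    def refresh(self):
        """Make every pending index mutation searchable."""
        for key, doc in self.pending:
            self.index[key] = doc
        self.pending.clear()

    def commit(self, key, doc, strong=False):
        self.storage[key] = doc           # storage ack: point of no return
        self.pending.append((key, doc))   # index mutation dispatched
        if strong:
            # Strong: do not return until the document is searchable.
            # The toy forces a refresh here purely to keep the model small.
            self.refresh()

    def search(self, key):
        """Lookup routed through the index only."""
        return self.index.get(key)


g = ToyGraph()

g.commit("v1", {"name": "alice"})             # eventual: returns at storage ack
stale = g.search("v1")                        # not searchable yet
g.refresh()
fresh = g.search("v1")                        # searchable after the refresh

g.commit("v2", {"name": "bob"}, strong=True)  # strong: returns once searchable
immediate = g.search("v2")
```

In both cases the vertex is in `storage` the moment `commit` returns; the only difference is whether `search` can see it at that point.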

Storage consistency operates independently of index consistency, and conflating the two is the most common tuning error on this boundary. QUORUM on the CQL layer governs durability and read repair inside the storage cluster; it says nothing about index visibility. Raising the storage level does not shrink the replication window — it only adds write latency. The decision framework for matching each model to a concrete latency budget is worked through in Eventual vs Strong Consistency Tradeoffs in JanusGraph; this page covers the configuration and code that implement whichever side of the boundary you land on.

Core Configuration & Consistency Tuning

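Tuning happens through explicit property overrides in janusgraph.properties. The block below is the profile for an instance that serves visibility-critical writes: it keeps the async dispatch pipeline but sets bulk-refresh=wait_for, so every commit through this instance blocks until its index changes are searchable. Because the setting is read from the instance's configuration, bulk-ingestion instances should run the same profile with bulk-refresh=false. Every non-default line changes a failure mode.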

properties

# --- Storage backend consistency (Cassandra / ScyllaDB) ---
# QUORUM prevents stale reads from lagging replicas; match it to your
# replication factor. Use LOCAL_QUORUM for multi-region to avoid cross-DC latency.
storage.backend=cql
storage.cql.read-consistency-level=QUORUM
storage.cql.write-consistency-level=QUORUM
# Cap batch size so a large mutation cannot exceed the CQL frame limit.
storage.cql.batch-statement-size=50

# --- Mixed index backend (the 'elasticsearch' value also drives OpenSearch) ---
index.search.backend=elasticsearch
index.search.hostname=es-cluster.internal
index.search.port=9200
index.search.elasticsearch.client-only=true

# --- The acknowledgment boundary ---
# wait_for = block the commit until a refresh makes the mutated shards searchable.
# false    = fire-and-forget, rely on refresh_interval (max throughput).
index.search.elasticsearch.bulk-refresh=wait_for

# --- Refresh cadence (create.ext.* is applied when JanusGraph creates the index) ---
# To change it on an existing index, use the Elasticsearch index settings API.
# Lower refresh_interval shrinks the eventual window but raises segment-merge I/O.
index.search.elasticsearch.create.ext.refresh_interval=1s

Setting bulk-refresh=wait_for makes JanusGraph block on each bulk request until a refresh has made its changes searchable. For Elasticsearch deployments, the official refresh interval documentation explains how to balance I/O overhead against query freshness. When the target is OpenSearch, the same elasticsearch backend value and property keys apply, but cluster coordination differs slightly; review OpenSearch Sync Patterns for node-level routing adjustments, and Elasticsearch Integration for the transport and auth wiring both backends share.
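At the Elasticsearch level, the three values correspond to the `refresh` query parameter of the Bulk API. The sketch below only builds such a request and sends nothing; it assumes the JanusGraph setting is passed through as that parameter, and `build_bulk_request` and the index name `janusgraph_people` are hypothetical names chosen for illustration.

```python
import json
from urllib.parse import urlencode


def build_bulk_request(base_url, index, docs, refresh="false"):
    """Return (url, ndjson_body) for an Elasticsearch _bulk call.

    refresh must be one of the values the Bulk API accepts.
    """
    if refresh not in ("true", "false", "wait_for"):
        raise ValueError(f"unsupported refresh value: {refresh!r}")
    url = f"{base_url}/_bulk?{urlencode({'refresh': refresh})}"
    lines = []
    for doc_id, source in docs.items():
        # Each document is an action line followed by its source line.
        lines.append(json.dumps({"index": {"_index": index, "_id": doc_id}}))
        lines.append(json.dumps(source))
    # The Bulk API expects newline-delimited JSON ending in a newline.
    return url, "\n".join(lines) + "\n"


url, body = build_bulk_request(
    "http://es-cluster.internal:9200",
    "janusgraph_people",            # hypothetical index name
    {"v1": {"name": "alice"}},
    refresh="wait_for",
)
```

The only difference between the fire-and-forget and blocking variants is the query string; the payload is identical, which is why the cost shows up as latency rather than as extra indexing work.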

Operational constraints when moving this boundary:

Never enable wait_for globally under sustained ingestion. It serializes throughput behind index refresh and multiplies thread contention on the search cluster’s write pool. Apply it only to the writes whose immediate visibility is a business requirement.
Match read-consistency-level to your replication factor. With RF=3, QUORUM (two replicas) survives one node loss without serving stale reads. LOCAL_ONE is acceptable only for traversal-heavy read paths that tolerate the eventual window.
Treat refresh_interval as the bound on the refresh wait. At 1s, a write can wait up to roughly one second for the next refresh even when the queue and transport are idle; a write that lands just before a refresh becomes visible sooner. Do not chase sub-second freshness by lowering it below 1s: you will thrash segment merges long before you meaningfully shrink the window.
Align the routing partition before you tighten the boundary. A hot shard makes wait_for far more expensive because the blocking commit waits on the slowest shard refresh; see Mixed Index Routing.
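The replication-factor arithmetic behind the QUORUM recommendation is small enough to check by hand. The helpers below are a minimal sketch with hypothetical names; they encode the usual majority-quorum rule and the overlap condition R + W > RF.

```python
def quorum(rf: int) -> int:
    """Replicas that must respond at QUORUM: floor(rf / 2) + 1."""
    return rf // 2 + 1


def tolerated_replica_losses(rf: int) -> int:
    """Replicas that can be down while QUORUM still succeeds."""
    return rf - quorum(rf)


def read_overlaps_write(rf: int, read_replicas: int, write_replicas: int) -> bool:
    """A read is guaranteed to touch a replica holding the latest
    acknowledged write when the two replica sets must overlap."""
    return read_replicas + write_replicas > rf
```

With RF=3, `quorum(3)` is 2 and one replica can be lost. A QUORUM write paired with a QUORUM read overlaps (2 + 2 > 3), while a QUORUM write paired with a single-replica read does not (1 + 2 = 3), which is why single-replica reads are only acceptable where stale results are tolerable.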

The Acknowledgment Boundary and the Replication Window

Under eventual consistency, the interval between the storage ack and the moment the index shard becomes searchable is the replication window. It is the sum of the queue wait, the bulk transport, and the wait for the next refresh:

W_{\text{drift}} = t_{\text{queue}} + t_{\text{bulk}} + t_{\text{refresh}}

With refresh_interval=1s, the t_refresh term alone can reach roughly one second no matter how idle the pipeline is, because a write waits for the next scheduled refresh. Setting bulk-refresh=wait_for does not shorten that wait: wait_for holds the response until a refresh makes the write searchable rather than forcing one (forcing is what Elasticsearch's refresh=true does). Nor does it remove t_queue or t_bulk. It converts the window from asynchronous background cost into synchronous commit latency that you pay on the critical path. In practice, expect a 15–40 ms increase per batch under near-synchronous mode versus fire-and-forget when a refresh is imminent, and more when the commit has to wait out the refresh interval.
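To make the formula concrete, the sketch below adds the three terms for a given set of measurements and compares the result with a read-your-writes budget. The function names and the sample numbers are illustrative assumptions, not measurements from a real cluster.

```python
def replication_window_ms(t_queue_ms: int, t_bulk_ms: int, t_refresh_ms: int) -> int:
    """W_drift = t_queue + t_bulk + t_refresh, all in milliseconds."""
    return t_queue_ms + t_bulk_ms + t_refresh_ms


def within_budget(window_ms: int, budget_ms: int) -> bool:
    """True when the window fits the read-your-writes budget."""
    return window_ms <= budget_ms


# Hypothetical samples: an idle pipeline and a moderately loaded one,
# each waiting a full 1000 ms for the next refresh.
idle = replication_window_ms(0, 0, 1000)
loaded = replication_window_ms(20, 30, 1000)
```

Here `idle` is 1000 ms and `loaded` is 1050 ms: when the refresh term dominates, trimming the queue or transport terms barely moves the total.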

There are three distinct ways to satisfy a read-your-writes requirement, in ascending order of cost:

Route around the index. If the read can be expressed as a storage-backed lookup — hasId() or has() on a composite-indexed property — it never touches the mixed index and the window is irrelevant. This is the cheapest option and should be the first thing you check.
Poll for visibility. Leave the boundary eventual and, after commit, poll the index for the specific document until it appears or a deadline elapses. This keeps ingestion throughput high and isolates the latency cost to the one reader that needs freshness.
Block at commit. Use bulk-refresh=wait_for (or an explicit _refresh at the transaction boundary) only for the narrow set of writes whose downstream read cannot tolerate the window at all.
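The ordering above can be written as a small decision helper. This is a sketch of the reasoning only: `Strategy`, `choose_strategy`, and the two boolean inputs are hypothetical names, and a real decision would also weigh the latency budget.

```python
from enum import Enum


class Strategy(Enum):
    ROUTE_AROUND_INDEX = "storage-backed lookup"
    POLL_FOR_VISIBILITY = "poll the index after commit"
    BLOCK_AT_COMMIT = "bulk-refresh=wait_for"


def choose_strategy(read_is_storage_backed: bool, reader_can_wait: bool) -> Strategy:
    """Pick the cheapest option that satisfies read-your-writes."""
    if read_is_storage_backed:
        # hasId() or a composite-indexed has(): the window is irrelevant.
        return Strategy.ROUTE_AROUND_INDEX
    if reader_can_wait:
        # Only the reader that needs freshness pays the latency.
        return Strategy.POLL_FOR_VISIBILITY
    # The downstream read cannot tolerate the window at all.
    return Strategy.BLOCK_AT_COMMIT
```

Checking the cheapest option first keeps blocking commits as the last resort rather than the default.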

To decide which applies, you must be able to measure the window. Track index lag as the delta between the storage commit timestamp and index visibility, and alert when it exceeds your read-your-writes budget. A monotonically rising IndexProvider queue is backpressure — the producer is outrunning the search cluster — while a rising bulk-flush duration means the index cluster itself is falling behind. These two signals distinguish “tighten the producer” from “scale the index,” and confusing them wastes an on-call cycle.
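One way to turn those two signals into a triage hint is sketched below. The function names, the "strictly rising across the sample window" rule, and the choice to let a rising flush duration take precedence when both series rise are assumptions made for illustration; the samples are expected to come from whatever metrics pipeline you already run.

```python
def index_lag_ms(commit_ts_ms: int, visible_ts_ms: int) -> int:
    """Delta between the storage commit and index visibility."""
    return visible_ts_ms - commit_ts_ms


def is_rising(samples) -> bool:
    """True when every sample is larger than the one before it."""
    return len(samples) >= 2 and all(b > a for a, b in zip(samples, samples[1:]))


def diagnose(queue_depths, flush_durations_ms) -> str:
    if is_rising(flush_durations_ms):
        # The index cluster itself is falling behind.
        return "scale the index"
    if is_rising(queue_depths):
        # The producer is outrunning an otherwise healthy index.
        return "tighten the producer"
    return "no sustained trend"


def lag_alert(commit_ts_ms: int, visible_ts_ms: int, budget_ms: int) -> bool:
    """Alert when index lag exceeds the read-your-writes budget."""
    return index_lag_ms(commit_ts_ms, visible_ts_ms) > budget_ms
```

For example, `diagnose([10, 40, 90], [12, 11, 12])` points at the producer, while `diagnose([10, 40, 90], [12, 30, 75])` points at the index cluster.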

Python Integration: Selective Read-Your-Writes

Production pipelines need explicit transaction management, idempotent mutations, and a fallback that reconciles visibility instead of assuming it. The pattern below writes on the eventual path for throughput, then, for the small set of records that require it, polls the index until the document is searchable rather than blocking every commit. It uses gremlin-python and tenacity for resilient transport-level retry.

python

import asyncio
from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection
from gremlin_python.process.anonymous_traversal import traversal
from gremlin_python.process.graph_traversal import __
from gremlin_python.driver.protocol import GremlinServerError
from tenacity import (
    retry, stop_after_attempt, wait_exponential, retry_if_exception_type,
)

RETRYABLE = (ConnectionError, GremlinServerError, asyncio.TimeoutError)


class ConsistencyAwarePipeline:
    def __init__(self, ws_url: str):
        self.ws_url = ws_url
        self.g = None
        self.conn = None

    async def connect(self):
        # gremlin-python's driver calls are blocking, so every driver call in
        # this class runs in a worker thread (asyncio.to_thread, Python 3.9+).
        self.conn = await asyncio.to_thread(DriverRemoteConnection, self.ws_url, 'g')
        self.g = traversal().withRemote(self.conn)

    @retry(
        stop=stop_after_attempt(3),
        wait=wait_exponential(multiplier=1, min=2, max=10),
        retry=retry_if_exception_type(RETRYABLE),
        reraise=True,  # surface the last real error, not tenacity.RetryError
    )
    async def _upsert_once(self, vertex_id: str, properties: dict):
        # Idempotent get-or-create keyed on the 'id' property so a replayed
        # batch updates rather than duplicating; duplicates would surface as
        # phantom index documents inside the replication window.
        t = self.g.V().has('person', 'id', vertex_id).fold().coalesce(
            __.unfold(),
            __.addV('person').property('id', vertex_id),
        )
        for k, val in properties.items():
            t = t.property(k, val)
        await asyncio.to_thread(t.iterate)

    async def upsert_vertex(self, vertex_id: str, properties: dict):
        # Wrap only after the retries are exhausted; wrapping inside the
        # retried call would hide GremlinServerError from the retry policy.
        try:
            await self._upsert_once(vertex_id, properties)
        except GremlinServerError as e:
            raise RuntimeError(f"Gremlin write failed for {vertex_id}: {e}") from e

    async def await_index_visibility(self, vertex_id: str,
                                     timeout: float = 5.0, interval: float = 0.25):
        """Read-your-writes fallback: poll the mixed index until the just-written
        vertex is searchable, bounded by a deadline. Cheaper than blocking every
        commit on bulk-refresh=wait_for when only some reads need freshness."""
        loop = asyncio.get_running_loop()
        deadline = loop.time() + timeout
        while True:
            # This poll only proves index visibility if JanusGraph answers it
            # from the mixed index. If 'id' also has a composite index, the
            # equality match may be served from storage instead; in that case
            # poll with a predicate on a key that only the mixed index covers.
            t = self.g.V().has('person', 'id', vertex_id).count()
            found = await asyncio.to_thread(t.next)
            if found:
                return True
            if loop.time() >= deadline:
                # Fallback: surface the miss so the caller can degrade
                # gracefully (retry later, read stale, or escalate) rather
                # than silently returning an empty result set.
                return False
            await asyncio.sleep(interval)

    async def close(self):
        if self.conn:
            # close() is a blocking call, not a coroutine.
            await asyncio.to_thread(self.conn.close)

The @retry decorator handles transient network faults and Gremlin server timeouts with exponential backoff; explicit exception chaining preserves the stack trace for observability. The await_index_visibility method is the key discipline: it isolates the strong-consistency cost to callers that genuinely need it, leaving the bulk of ingestion on the fast eventual path. When you do need commit-time blocking instead of polling, keep bulk-refresh=wait_for scoped to that pipeline and align its client timeouts with the server-side refresh_interval documented in Elasticsearch Integration.

Connection Lifecycle & Pool Management

The acknowledgment boundary directly shapes pool sizing, because wait_for holds each connection for the full duration of the index refresh, not just the storage commit. A pool tuned for fire-and-forget throughput will exhaust under near-synchronous writes, and the resulting TimeoutException is indistinguishable from index lag unless you instrument pool utilization.

Size the pool to write concurrency, then add headroom for blocking. Under eventual consistency, size the driver pool to peak concurrent writers. Under wait_for, multiply by the ratio of (commit + refresh) latency to commit-only latency: with the 15–40 ms per-batch increase, a connection that used to turn over in 5 ms now holds for 20–45 ms, so the same throughput needs roughly 4–9× the connections.
Bound idle timeout below the server’s. Set the client idle timeout under the Gremlin Server’s idleConnectionTimeout so the client reaps dead sockets first; a client reusing a server-closed socket surfaces as a spurious GremlinServerError that the retry policy will burn attempts on.
Make the retry policy boundary-aware. Under wait_for, a retry re-runs a mutation that may have already committed to storage but not yet been acknowledged by the index. Idempotent mutations (mergeV/coalesce on a stable key, as above) are mandatory here — without them a retry double-writes, and the duplicate propagates into the index as a phantom document.
Cap batch concurrency, not just batch size. A burst of wait_for batches fanning out concurrently can saturate the index write pool and stall every blocking commit at once; a bounded concurrency semaphore keeps the blast radius contained.
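The sizing ratio and the concurrency cap can both be sketched in a few lines. `pool_size` and `run_bounded` are hypothetical helpers, and the latency inputs are placeholders for your own measurements; `run_bounded` also reports the peak number of jobs in flight so the cap can be verified.

```python
import asyncio
import math


def pool_size(base_connections: int, commit_ms: float, blocking_ms: float) -> int:
    """Scale a fire-and-forget pool by (commit + blocking) / commit."""
    return math.ceil(base_connections * (commit_ms + blocking_ms) / commit_ms)


async def run_bounded(jobs, max_concurrency: int):
    """Run zero-argument coroutine functions with a cap on how many
    are in flight at once. Returns (results, peak_in_flight)."""
    sem = asyncio.Semaphore(max_concurrency)
    in_flight = 0
    peak = 0

    async def guarded(job):
        nonlocal in_flight, peak
        async with sem:
            in_flight += 1
            peak = max(peak, in_flight)
            try:
                return await job()
            finally:
                in_flight -= 1

    results = await asyncio.gather(*(guarded(job) for job in jobs))
    return list(results), peak


async def fake_blocking_batch():
    # Stand-in for a wait_for commit; a real job would call the pipeline.
    await asyncio.sleep(0.01)
    return 1
```

For a pool of 10 connections with a 5 ms commit, `pool_size(10, 5, 15)` gives 40 and `pool_size(10, 5, 40)` gives 90. Running `asyncio.run(run_bounded([fake_blocking_batch] * 10, 3))` completes all ten jobs while never holding more than three in flight.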

The full sizing model — pool minimums, maximums, and starvation symptoms — lives in Connection Pooling. Align it with the keyspace-level Replication Strategies you deploy, because a QUORUM write that must reach a remote datacenter stretches the storage-commit term of the window and changes how long each pooled connection is held.

Diagnostics & Operational Fallbacks

Instrument both sides of the boundary and correlate them; drift shows up as divergence between two series before it shows up as a user complaint. Use nodetool tpstats to watch Cassandra/ScyllaDB mutation stages and the Elasticsearch /_cat/segments?v and /_cat/thread_pool/write?v endpoints to verify refresh cycles and detect rejections. The triage table below maps the top failure modes on this boundary from alert to fix.

Symptom	Diagnose	Resolve
Reads miss writes committed seconds ago	Compare commit timestamp vs index visibility; `IndexProvider` queue rising, `/_nodes/stats/indices/indexing` latency climbing	Replication window is stretching under load — throttle the producer, or apply `bulk-refresh=wait_for` / `await_index_visibility` only to the reads that need freshness
`wait_for` commits timing out or p99 latency spiking	`/_cat/thread_pool/write?v` shows non-zero rejections; blocking commits stall together	The index write pool is saturated — lower batch concurrency, scale index write threads, and verify a hot shard is not serializing refresh via Mixed Index Routing
Traversal timeouts during ingestion bursts	`nodetool tpstats` clean but driver throws `TimeoutException`; pool utilization at 100%	Starved pool under `wait_for`, not index lag — resize per the Connection Pooling model and cap batch concurrency
Duplicate / phantom index documents after retries	Retried non-idempotent mutation committed twice; index count exceeds storage count	Make every mutation idempotent (`mergeV`/`coalesce` on a stable key); reconcile by reindexing affected documents from the authoritative storage view
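The rejection check in the second row is easy to automate. The sketch below parses the text returned by /_cat/thread_pool/write?v and reports nodes with non-zero rejections; it assumes the default columns (node_name, name, active, queue, rejected) and node names without spaces, and the sample output is made up for illustration.

```python
def rejected_write_pools(cat_output: str) -> dict:
    """Return {node_name: rejected_count} for rows with rejections."""
    rows = [line.split() for line in cat_output.strip().splitlines() if line.strip()]
    header, data = rows[0], rows[1:]
    name_col = header.index("node_name")
    rejected_col = header.index("rejected")
    return {
        row[name_col]: int(row[rejected_col])
        for row in data
        if int(row[rejected_col]) > 0
    }


# Illustrative output only, not captured from a real cluster.
SAMPLE = """\
node_name name  active queue rejected
es-data-1 write      8   120       37
es-data-2 write      2     0        0
"""

saturated = rejected_write_pools(SAMPLE)
```

With the sample above, only `es-data-1` is reported. Because the counter is cumulative per node, alert on its growth between polls rather than on its absolute value.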

Frequently Asked Questions

Why do search queries return stale results right after a commit? Because index dispatch is asynchronous under the default eventual model. The commit returns once storage acknowledges at QUORUM; the index becomes searchable only after the replication window t_queue + t_bulk + t_refresh elapses. To read your own write, either route the read through a storage-backed lookup, poll the index until the document appears, or gate that write behind bulk-refresh=wait_for.

Should I ever set bulk-refresh=wait_for globally? No. Applied globally it serializes throughput behind index refresh and multiplies thread contention on the search cluster’s write pool. Apply it selectively to the writes whose immediate visibility is a business requirement and leave bulk ingestion on false.

Does raising the storage consistency level fix index staleness? No. Storage consistency (QUORUM, ALL) governs durability and read repair inside the storage cluster only. Index visibility is a separate downstream concern; raising the storage level increases write latency without shrinking the replication window.

How do I recover after a partition between JanusGraph and the index? If the write-ahead transaction log is enabled (tx.log-tx=true) and a transaction recovery process is running, JanusGraph records index mutations that failed to persist and replays them once the index is reachable again. Without that log there is nothing to replay. If drift persists beyond the log horizon, run a REINDEX through the Management API during a maintenance window rather than dropping the index; the parent index synchronization guide covers the full repair procedure.
