External Index Synchronization & Consistency Tuning

JanusGraph decouples vertex/edge persistence from full-text and mixed indexing to enable horizontal scalability, and this separation is the single largest source of silent production incidents in graph deployments: the storage backend commits transactions before the index backend has applied the corresponding mutations. (Composite indexes are the exception: they live in the storage backend and are written in the same transaction as the data.) This subsystem, the alignment between the storage layer (Cassandra or ScyllaDB) and the search layer (Elasticsearch or OpenSearch), is where missing search results, phantom edges in traversal output, and cascading transaction timeouts originate. This guide is the reference for engineers who own that boundary in production. It builds directly on the JanusGraph Storage Backend Architecture foundation and pairs with the Graph Schema Validation & Modeling Strategies work that governs which properties reach the index in the first place. Proper tuning yields predictable query latency, bounded replication drift, and deterministic recovery paths; misconfiguration yields a graph that returns different answers depending on which subsystem you ask.

Storage commits are synchronous; index dispatch is asynchronous — the gap between the two is the replication window where drift originates.

The sequence below traces the two-phase write that creates the replication window between storage and index.

Steps ①–④ are synchronous and complete before the commit returns; steps ⑤–⑥ run asynchronously, and the window until the index is searchable is the drift.

Core Architecture & Consistency Boundaries

Graph mutations traverse a strict two-phase execution path. First, the transaction commits to the distributed storage layer using the configured consistency level. Second, JanusGraph serializes the mutation payload and dispatches index updates asynchronously through the IndexProvider interface. The IndexSerializer maps graph elements to inverted-index documents, and the underlying HTTP client pushes bulk requests to the search cluster. Three layers carry distinct responsibilities, and drift is always born at the seam between them:

Storage backend — persists vertices, edges, and properties; owns durability, partitioning, and replication. A LOCAL_QUORUM write here is the point of no return: once storage acknowledges, the mutation is authoritative regardless of index state.
Index backend — maintains the mixed indexes that answer full-text, range, and geo predicates. It is eventually consistent by default and trails storage by the replication window.
Compute / traversal layer — Gremlin Server plus your gremlin-python clients; opens transactions, decides which predicates hit storage versus the index, and is where read-your-writes expectations are violated when the window is not accounted for.

Drift originates precisely because step one is synchronous and step two is not. The replication window is the interval between the storage ack and the moment the index shard becomes searchable. You can model its lower bound as the sum of the queue wait, the bulk transport, and the index refresh interval:

W_{\text{drift}} = t_{\text{queue}} + t_{\text{bulk}} + t_{\text{refresh}}

With the default refresh_interval of one second, t_refresh alone can delay visibility by up to one second even when the queue and transport are idle. Every tuning decision below either shrinks one of these terms or makes the window explicit to the application instead of implicit and surprising. Because the storage and index consistency models are independent, you must reason about them separately. That decision is covered in depth under Eventual vs Strong Consistency, where the tradeoff between acknowledging-before-indexing and blocking-until-searchable is analyzed against workload SLAs.
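To make the model concrete, here is a minimal sketch that sums the three terms. The function name and the sample timings are illustrative assumptions, not measurements from a real cluster; t_refresh is taken as the full refresh_interval, which is the case for a write that lands just after a refresh.

```python
def drift_window_ms(t_queue_ms: float, t_bulk_ms: float, t_refresh_ms: float) -> float:
    """Return W_drift = t_queue + t_bulk + t_refresh, in milliseconds."""
    for term in (t_queue_ms, t_bulk_ms, t_refresh_ms):
        if term < 0:
            raise ValueError("timing terms must be non-negative")
    return t_queue_ms + t_bulk_ms + t_refresh_ms


# Hypothetical idle cluster: empty queue, 20 ms bulk round trip,
# refresh_interval of 1s.
idle = drift_window_ms(0, 20, 1000)

# Hypothetical loaded cluster: 300 ms queue wait, same transport,
# refresh_interval raised to 5s.
loaded = drift_window_ms(300, 20, 5000)

print(idle, loaded)
```

Raising refresh_interval to protect indexing throughput dominates the sum quickly, which is why the refresh term is usually the first one to examine.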

Production Configuration Reference

The following janusgraph.properties block is a hardened baseline. Every non-default value carries an operational rationale — do not copy it blind; each line changes a failure mode. Baseline transport configuration for Elasticsearch Integration establishes reliable dispatch; the equivalent wiring for OpenSearch Sync Patterns reuses the same elasticsearch backend value with version-aware client mapping.

properties

# --- Storage layer (authoritative writes) ---
storage.backend=cql
storage.hostname=10.0.1.10,10.0.1.11,10.0.1.12
storage.cql.keyspace=janusgraph_prod
storage.cql.local-datacenter=dc1
# With RF=3, LOCAL_QUORUM tolerates one replica down in the local DC without blocking on remote DCs.
# ALL bottlenecks under concurrent writes; ONE risks acking data that is later lost.
storage.cql.read-consistency-level=LOCAL_QUORUM
storage.cql.write-consistency-level=LOCAL_QUORUM

# --- Index layer (eventual by design) ---
# The 'elasticsearch' backend value also drives OpenSearch clusters.
index.search.backend=elasticsearch
index.search.hostname=es-cluster.internal:9200
index.search.elasticsearch.client-only=true
# Auth is required in production; never ship an open index port.
index.search.elasticsearch.http.auth.type=basic
index.search.elasticsearch.http.auth.basic.username=janus_sync
index.search.elasticsearch.http.auth.basic.password=${ES_SYNC_PASS}
# Fail fast on a partition rather than pinning worker threads on dead sockets.
index.search.elasticsearch.http.connection-timeout=10000
index.search.elasticsearch.http.socket-timeout=60000

# --- Sync behaviour (the tuning surface) ---
# false = fire-and-forget (max throughput); wait_for = block until searchable.
index.search.elasticsearch.bulk-refresh=false
# Cap concurrent bulk requests so a write burst cannot exhaust the ES write pool.
index.search.elasticsearch.http.max-connections=50
index.search.elasticsearch.http.max-connections-per-route=20
# Batch and retry bounds for sustained ingestion.
index.search.elasticsearch.bulk-size=1000
index.search.elasticsearch.max-retry-time=300000

Storage consistency levels operate independently of index sync. A LOCAL_QUORUM write to ScyllaDB is acknowledged once a quorum of replicas in the local datacenter has persisted it, but that says nothing about index visibility. Decoupling the two prevents an index-cluster scaling event from stalling storage commits. Keep this baseline aligned with the keyspace-level Replication Strategies you deploy: a mismatch between the storage replication factor and the index refresh policy produces read-repair storms on one side and stale reads on the other. For rack-aware write tuning that complements this block, the official ScyllaDB consistency documentation is the upstream reference.

Index Backend Wiring & Synchronization Mechanics

Asynchronous dispatch is non-negotiable for production throughput. Synchronous index commits would serialize storage transactions behind index refresh cycles, amplify write latency, and turn every index-cluster maintenance window into a storage outage. The tradeoff is that you now own an in-flight queue, and an unbounded queue is a memory leak with a delay.

Async dispatch model. A background worker drains the mutation queue and assembles bulk requests up to bulk-size. Setting bulk-refresh=wait_for makes the index backend acknowledge only after the relevant shards are searchable. This removes the t_refresh term from the window that follows the ack, but it does so by moving the wait into the request itself: each batch blocks until the next scheduled refresh, which can take up to refresh_interval, and the held-open requests multiply thread contention on the search cluster. For high-throughput pipelines, prefer bulk-refresh=false and implement application-level read-your-writes handling: cache the mutation locally, or issue an explicit _refresh only at transaction boundaries where a subsequent read genuinely depends on it.
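The "cache the mutation locally" option can be sketched as a small overlay that remembers recent writes for roughly one drift window and merges them into index results. This is a minimal illustration under stated assumptions: the class name, the dict-shaped documents, and the caller-supplied predicate are all hypothetical, and nothing here is part of JanusGraph or gremlin-python.

```python
import time


class RecentWriteOverlay:
    """Remember recently committed documents until the index has caught up."""

    def __init__(self, window_seconds, clock=time.monotonic):
        self._window = window_seconds
        self._clock = clock
        self._recent = {}  # doc_id -> (expires_at, document)

    def record(self, doc_id, document):
        """Call after a successful storage commit."""
        self._recent[doc_id] = (self._clock() + self._window, document)

    def _evict_expired(self):
        now = self._clock()
        self._recent = {k: v for k, v in self._recent.items() if v[0] > now}

    def merge(self, index_hits, matches):
        """Overlay unexpired local writes onto index results.

        index_hits: dict of doc_id -> document returned by the index query.
        matches: predicate deciding whether a cached document satisfies the query.
        """
        self._evict_expired()
        merged = dict(index_hits)
        for doc_id, (_, document) in self._recent.items():
            if matches(document):
                merged[doc_id] = document      # local write wins over a stale index copy
            else:
                merged.pop(doc_id, None)       # updated document no longer matches
        return merged
```

Choosing window_seconds slightly above the measured replication window keeps the cache small; entries older than that are assumed to be visible in the index already.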

Queue depth and backpressure. The queue is the shock absorber between a bursty pipeline and a finite index cluster. When ingestion sustains above the index’s flush-and-merge capacity, the queue grows, heap pressure rises, and eventually the HTTP client begins rejecting on max-connections. Backpressure must propagate to the producer: bound your batch size, watch the ES write thread-pool queue depth, and slow the pipeline when rejections appear rather than retrying into a saturated cluster. The max-connections-per-route cap exists specifically so one hot shard route cannot consume the entire connection budget.
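One way to propagate backpressure to the producer is an additive-increase / multiplicative-decrease (AIMD) controller on batch size, driven by the cumulative rejection counter of the write thread pool. AIMD is a technique swapped in here for illustration, not something the configuration above provides; the class name and default bounds are assumptions.

```python
class BatchThrottle:
    """AIMD controller: halve the batch on new rejections, grow it slowly otherwise."""

    def __init__(self, initial=500, minimum=50, maximum=1000, step=50):
        self.batch_size = initial
        self._min, self._max, self._step = minimum, maximum, step
        self._last_rejected = None

    def observe(self, rejected_total):
        """Feed the cumulative rejection counter; return the next batch size."""
        if self._last_rejected is None:
            pass  # first sample only establishes the baseline
        elif rejected_total > self._last_rejected:
            # New rejections since the last sample: back off multiplicatively.
            self.batch_size = max(self._min, self.batch_size // 2)
        else:
            # No new rejections (or the counter reset): probe upward additively.
            self.batch_size = min(self._max, self.batch_size + self._step)
        self._last_rejected = rejected_total
        return self.batch_size
```

The producer would call observe() once per polling interval and use the returned value for its next batch, so a saturated cluster sees smaller batches instead of retries.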

Routing and shard alignment. Mixed indexes combine storage-backed property lookups with full-text or range queries, and the routing strategy directly impacts sync efficiency and query fan-out. By default JanusGraph relies on Elasticsearch/OpenSearch auto-routing, which produces hot shards when graph partitions have skewed degree distributions. Set shard and replica counts at index-creation time and mirror the storage topology — over-sharding inflates heap and slows recovery, under-sharding bottlenecks concurrent mutation bursts. The full decision logic lives in Mixed Index Routing.

properties

# number_of_shards is fixed at index creation; number_of_replicas can be changed later. Plan before first mutation.
index.search.elasticsearch.create.ext.number_of_shards=12
index.search.elasticsearch.create.ext.number_of_replicas=1

For payload sizing during high-velocity sync, keep each bulk request below the cluster's http.max_content_length (100 MB by default); a larger request is rejected with HTTP 413 (Request Entity Too Large). The official Elasticsearch Bulk API reference covers how to size bulk requests.
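A producer can estimate payload size before handing a batch over. The sketch below chunks documents by serialized byte size as well as by count. It is an approximation under stated assumptions: JanusGraph assembles the actual bulk request, the per-document action metadata line is not counted, and the function name and limits are hypothetical.

```python
import json


def chunk_by_bytes(documents, max_bytes, max_docs=1000):
    """Yield lists of documents whose estimated serialized size stays within max_bytes."""
    chunk, size = [], 0
    for doc in documents:
        # +1 accounts for the newline that terminates each NDJSON line.
        doc_bytes = len(json.dumps(doc).encode("utf-8")) + 1
        if doc_bytes > max_bytes:
            raise ValueError("single document exceeds max_bytes")
        if chunk and (size + doc_bytes > max_bytes or len(chunk) >= max_docs):
            yield chunk
            chunk, size = [], 0
        chunk.append(doc)
        size += doc_bytes
    if chunk:
        yield chunk
```

Setting max_bytes well below the HTTP limit leaves headroom for the metadata lines this estimate ignores.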

Python Pipeline Orchestration

ETL pipelines and Spark connectors frequently drive ingestion through gremlin-python. The Gremlin Server session model does not auto-commit; a failure mid-pipeline leaves partial state in storage and a half-drained index queue. Every batch loader must therefore enforce explicit transaction boundaries, idempotent mutations, and bounded retry. The pattern below commits in fixed batches, rolls back cleanly on failure, and retries transient transport errors with exponential backoff and jitter.

python

import random
import time

from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection
from gremlin_python.driver.protocol import GremlinServerError
from gremlin_python.process.anonymous_traversal import traversal
from gremlin_python.process.traversal import T

# Simplification: not every GremlinServerError is transient. In production,
# inspect the status code and retry only timeouts and overload responses.
RETRYABLE = (ConnectionError, TimeoutError, GremlinServerError)

def _write_batch(g, batch):
    """Write one batch in its own transaction; roll back if any step fails."""
    tx = g.tx()
    gtx = tx.begin()  # transaction-bound traversal source
    try:
        for v in batch:
            # merge_v matches on label plus the 'id' property, so a replayed
            # batch updates the existing vertex rather than duplicating it.
            # The vertex label goes in the merge map; 'label' is not a property.
            gtx.merge_v({T.label: v['label'], 'id': v['id']}) \
               .property('data', v['data']).iterate()
        tx.commit()  # flush this batch to CQL, queue index mutations
    except Exception:
        if tx.is_open():
            tx.rollback()
        raise

def batch_ingest(g, vertices, batch_size=500, attempts=5):
    for offset in range(0, len(vertices), batch_size):
        batch = vertices[offset:offset + batch_size]
        for attempt in range(attempts):
            try:
                # A failed commit cannot be re-committed, so replay the whole
                # batch in a fresh transaction; idempotent merges make that safe.
                _write_batch(g, batch)
                break
            except RETRYABLE as e:
                if attempt == attempts - 1:
                    raise RuntimeError(f"Ingestion failed at offset {offset}: {e}") from e
                # Exponential backoff with jitter avoids thundering-herd on a recovering cluster.
                time.sleep(min(30, 2 ** attempt) + random.uniform(0, 0.5))

if __name__ == '__main__':
    connection = DriverRemoteConnection('ws://janusgraph-server:8182/gremlin', 'g')
    try:
        g = traversal().with_remote(connection)
        batch_ingest(g, vertices=[])  # supply your own list of {'id', 'label', 'data'} dicts
    finally:
        connection.close()

Pipeline rules that keep storage and index aligned:

Use .iterate() for mutations, never .toList() — materializing result sets wastes heap and does nothing for a write.
Commit at 200–500 mutations per transaction to balance CQL write amplification against transaction-log overhead.
Make every mutation idempotent (mergeV/mergeE on a stable key) so a retried batch cannot double-write, which would otherwise surface as duplicate index documents.
Size the driver pool to match your write concurrency; a starved pool produces TimeoutException that looks identical to index lag. See Connection Pooling for the sizing model.
For traversal and session semantics, the official Apache TinkerPop Gremlin documentation is authoritative.

Bulk-loading path. When a loader bypasses the transactional API for raw throughput, it also bypasses the async index queue unless index dispatch runs in parallel with storage commits. During large loads, set storage.batch-loading=true and disable periodic index refresh (index.refresh_interval=-1 via the _settings API) to stop excessive segment creation. After the load, restore refresh_interval to its production value (5s–30s in this setup), force a refresh, and force-merge before routing production traffic:

bash

# Make all bulk-loaded documents searchable in one step.
curl -X POST "http://es-cluster:9200/janusgraph_mixed/_refresh"

# Collapse segments after a large load to restore query latency.
curl -X POST "http://es-cluster:9200/janusgraph_mixed/_forcemerge?max_num_segments=5"
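The refresh_interval toggle around a bulk load can be sketched in Python as well. The helper below only builds the _settings requests so they can be inspected before anything is sent; the helper name is hypothetical, the host and index name are reused from the curl examples, and authentication is deliberately left out.

```python
import json
import urllib.request


def refresh_interval_request(base_url, index, interval):
    """Build (not send) a PUT to the index _settings API that sets refresh_interval."""
    body = json.dumps({"index": {"refresh_interval": interval}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{index}/_settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Disable periodic refresh before the load, restore it afterwards.
disable = refresh_interval_request("http://es-cluster:9200", "janusgraph_mixed", "-1")
restore = refresh_interval_request("http://es-cluster:9200", "janusgraph_mixed", "5s")

# To apply: urllib.request.urlopen(disable) before the load and
# urllib.request.urlopen(restore) after it, with credentials added as needed.
```

Building the request separately from sending it makes the lifecycle easy to unit-test and to wrap in a try/finally so the restore step always runs.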

For ScyllaDB-backed deployments, run nodetool repair before triggering index rebuilds so the index is not populated from an under-replicated storage view. Teams evaluating the storage side should review ScyllaDB Migration for the read/write consistency benchmarks that determine how tight the index window can safely be.

Diagnostics & Index Repair

You cannot tune what you cannot see. Expose JanusGraph JMX metrics through Prometheus and correlate them with index-cluster and storage signals — drift always shows up as divergence between two of these series before it shows up in a user complaint.

org.janusgraph.diskstorage.indexing.IndexProvider — index queue size and bulk-flush duration. A monotonically rising queue is backpressure; a rising flush duration is an index cluster falling behind.
Elasticsearch /_cat/thread_pool/write?v — write queue depth and rejection counts. Non-zero rejections mean your producer is outrunning the search cluster.
Elasticsearch /_nodes/stats/indices/indexing — index latency trend, the leading indicator of a growing replication window.
ScyllaDB/Cassandra nodetool tpstats — storage backlog, to distinguish a storage stall from an index stall.
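The write thread-pool signal can be turned into an alertable summary with a few lines of Python. This sketch assumes the endpoint is called with format=json and returns the default node_name, queue, and rejected columns; the function names and the sample payload are illustrative, not captured output.

```python
import json


def write_pool_pressure(cat_json_text):
    """Summarize /_cat/thread_pool/write?format=json output per node."""
    summary = {}
    for row in json.loads(cat_json_text):
        # With format=json the _cat API returns numeric columns as strings.
        summary[row["node_name"]] = {
            "queue": int(row["queue"]),
            "rejected": int(row["rejected"]),
        }
    return summary


def nodes_rejecting(summary):
    """Return sorted node names whose cumulative rejection counter is non-zero."""
    return sorted(name for name, s in summary.items() if s["rejected"] > 0)


# Illustrative payload shaped like the _cat response.
sample = json.dumps([
    {"node_name": "es-1", "name": "write", "active": "4", "queue": "12", "rejected": "0"},
    {"node_name": "es-2", "name": "write", "active": "8", "queue": "200", "rejected": "37"},
])
pressure = write_pool_pressure(sample)
print(nodes_rejecting(pressure))
```

The rejected column is cumulative since node start, so alert on its rate of change between polls rather than on the absolute value.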

When drift exceeds threshold, reindex through the Management API rather than dropping the index. REINDEX reads from the authoritative storage backend and pushes deltas without taking the index offline:

java

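import org.janusgraph.core.schema.JanusGraphIndex;
import org.janusgraph.core.schema.JanusGraphManagement;
import org.janusgraph.core.schema.SchemaAction;
import org.janusgraph.core.schema.SchemaStatus;
import org.janusgraph.graphdb.database.management.ManagementSystem;

JanusGraphManagement mgmt = graph.openManagement();
JanusGraphIndex index = mgmt.getGraphIndex("search");
mgmt.updateIndex(index, SchemaAction.REINDEX).get();
mgmt.commit();
// awaitGraphIndexStatus waits for REGISTERED by default; request ENABLED explicitly.
ManagementSystem.awaitGraphIndexStatus(graph, "search").status(SchemaStatus.ENABLED).call();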

Always run REINDEX inside a maintenance window with bulk-refresh=false to minimize cluster load. For catastrophic index corruption, drop and recreate the mixed index, then execute a full storage-to-index sync with storage.batch-loading disabled to prevent transaction-log overflow, and confirm getIndexStatus(key) returns ENABLED before serving queries.

Failure-Mode Reference

The four failure modes below account for the large majority of index-sync incidents. Each row is symptom → diagnosis command → resolution so an on-call engineer can move from alert to fix without a design discussion.

| Symptom | Diagnose | Resolve |
| --- | --- | --- |
| Recent writes missing from full-text results | `curl /_nodes/stats/indices/indexing` shows rising latency; `IndexProvider` queue climbing | Replication window is stretching under load: throttle the producer, or set `bulk-refresh=wait_for` only on the writes that need read-your-writes |
| Phantom edges: traversal returns elements that no longer exist | `getIndexStatus(key)` not `ENABLED`; index documents outlive deleted storage rows | Reindex via Management API `SchemaAction.REINDEX`; if corrupted, drop/recreate the mixed index and full-sync from storage |
| `EsRejectedExecutionException` / bulk rejections | `/_cat/thread_pool/write?v` shows non-zero rejections | Lower `bulk-size` and `max-connections`; add producer-side backpressure; scale index write threads before retrying |
| Traversal timeouts during ingestion bursts | `nodetool tpstats` clean but driver throws `TimeoutException`; pool utilization at 100% | Starved driver pool, not index lag: resize per the Connection Pooling model and cap batch concurrency |

Frequently Asked Questions

Why do search queries return stale results right after a commit? Because index dispatch is asynchronous. The commit returns as soon as storage acknowledges at LOCAL_QUORUM; the index becomes searchable only after the replication window t_queue + t_bulk + t_refresh elapses. If a read must see its own write, gate it behind bulk-refresh=wait_for or an explicit _refresh at the transaction boundary.

Should I ever set bulk-refresh=wait_for globally? No. Applied globally it serializes throughput behind index refresh and multiplies thread contention. Apply it selectively to the small set of writes whose immediate visibility is a business requirement, and leave the bulk of ingestion on false.

Does a stronger storage consistency level fix index drift? No. Storage consistency (LOCAL_QUORUM, ALL) governs durability and read repair inside the storage cluster only. Index visibility is a separate, downstream concern — raising the storage level increases write latency without shrinking the replication window.

How do I recover after a network partition between JanusGraph and the index? If the write-ahead transaction log is enabled (tx.log-tx=true) and a transaction recovery process is running, JanusGraph records pending index mutations in its transaction log and replays them once the index is reachable again. Without that configuration, or if drift persists beyond the log horizon, run a REINDEX through the Management API during a maintenance window; for corruption, drop and recreate the mixed index and full-sync from storage.

Up a level: JanusGraph Storage Backend Architecture & Configuration — the storage foundation this index-sync work sits on top of.
Elasticsearch Integration — transport, auth, and dispatch wiring for Elasticsearch backends.
OpenSearch Sync Patterns — version-aware client mapping and drift reconciliation for OpenSearch.
Eventual vs Strong Consistency — choosing the acknowledgment boundary against workload SLAs.
Mixed Index Routing — shard alignment and predicate routing to prevent hot shards.
Graph Schema Validation & Modeling Strategies — the sibling area governing which properties are indexed at all.