Mixed Index Routing

Mixed index routing in JanusGraph determines how vertex and edge property predicates are dispatched to external search backends. When queries need full-text, geo, or range predicates that composite indexes cannot serve, routing decisions directly affect query latency, index consistency, and cluster stability. This guide covers production routing configuration, pipeline orchestration, and consistency tuning as part of External Index Synchronization & Consistency Tuning.

The decision tree below shows how JanusGraph selects an index backend based on the query predicate.

flowchart TD
    Q["Incoming traversal"] --> D{"Predicate type?"}
    D -->|"full-text / geo / range"| MX["Mixed index (ES/OpenSearch)"]
    D -->|"equality / composite key"| CP["Composite index (storage)"]
    D -->|"unindexed"| FS["Full scan (avoid)"]
    classDef warn fill:#fdecea,stroke:#c0392b,color:#0f2730;
    class FS warn

Routing Mechanics & Backend Selection

JanusGraph evaluates mixed index routing at query execution time. The MixedIndex abstraction inspects the traversal predicate, matches it against registered index definitions, and dispatches the request to the configured external backend. Routing failures typically originate from three operational gaps: unbound property keys, degraded backend health, or mapping drift between the storage layer and the search cluster.

To enforce deterministic routing, bind property keys explicitly to dedicated index backends using index.<name>.backend and index.<name>.hostname directives. When operating dual backends or executing live migrations, routing logic must account for analyzer differences, tokenization rules, and shard allocation strategies. This becomes critical when transitioning between Elasticsearch Integration and OpenSearch Sync Patterns, as default field mappings and text analyzers diverge across major versions. Misaligned routing triggers full graph scans, stale reads, or IndexNotFoundException errors under high concurrency.
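The selection logic described above can be sketched as a small pure-Python model. This is an illustration only, not JanusGraph's actual query optimizer: `Route`, `choose_route`, and the predicate groupings are hypothetical names introduced here, and the sketch assumes each property key is known to be covered by a mixed index, a composite index, or neither.

```python
from enum import Enum


class Route(Enum):
    MIXED = "mixed index"
    COMPOSITE = "composite index"
    FULL_SCAN = "full scan"


# Hypothetical groupings that mirror the decision tree:
# full-text, geo, and range predicates need a mixed index.
MIXED_ONLY = {"textContains", "textPrefix", "textRegex", "geoWithin",
              "gt", "gte", "lt", "lte"}
EQUALITY = {"eq"}


def choose_route(key, predicate, mixed_keys, composite_keys):
    """Return the route a predicate on `key` would take in this simplified model."""
    if predicate in MIXED_ONLY:
        # Only a mixed index can answer these; otherwise the query scans.
        return Route.MIXED if key in mixed_keys else Route.FULL_SCAN
    if predicate in EQUALITY:
        # Prefer the composite index for equality; a mixed index can also serve it.
        if key in composite_keys:
            return Route.COMPOSITE
        if key in mixed_keys:
            return Route.MIXED
    return Route.FULL_SCAN
```

For example, `choose_route("age", "gt", set(), {"age"})` yields `Route.FULL_SCAN` in this model, because a range predicate on a key that only has a composite index cannot be served by that index.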

Production Configuration Tuning

JanusGraph routing behavior is governed by janusgraph.properties. The following configuration optimizes for high-throughput ingestion, controlled visibility lag, and stable backend routing:

properties
# Storage Backend
storage.backend=cql
storage.hostname=cassandra-01,cassandra-02,cassandra-03
storage.cql.keyspace=janusgraph_prod

# Mixed Index Routing
index.search.backend=elasticsearch
index.search.hostname=es-01,es-02,es-03
index.search.elasticsearch.interface=REST_CLIENT
index.search.elasticsearch.http.auth.type=basic
index.search.elasticsearch.http.auth.basic.username=graph_user
index.search.elasticsearch.http.auth.basic.password=secure_pass

# Consistency & Sync Tuning
index.search.elasticsearch.create.ext.number_of_shards=3
index.search.elasticsearch.create.ext.number_of_replicas=1
index.search.elasticsearch.create.ext.refresh_interval=5s

# Routing & Fallback Behavior
index.search.elasticsearch.bulk-refresh=false
index.search.index-name=janusgraph_mixed
# Confirm the two keys below against the configuration reference for your JanusGraph version.
index.search.elasticsearch.mapping-mode=DEFAULT
index.search.elasticsearch.bulk-size=1000

Key routing implications:

  • bulk-refresh=false does not force an index refresh after each bulk request, preventing write amplification during peak ingestion.
  • create.ext.refresh_interval=5s is passed to Elasticsearch when JanusGraph creates the index, so new writes become searchable within roughly five seconds. Lower values increase I/O overhead on the search cluster without guaranteeing transactional consistency.
  • index-name fixes the name JanusGraph uses for its indices in the search backend, so routing targets a known, predictably named index topology.
  • mapping-mode=DEFAULT is meant to keep the default field mapping. Confirm that the key exists in your release, since field mappings are normally chosen per property key through the Mapping parameter (DEFAULT, TEXT, STRING, TEXTSTRING) when the key is added to a mixed index.
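A pre-deployment check can catch an index that has no explicit backend or hostname binding before JanusGraph starts. The following is a minimal sketch, assuming a plain `key=value` properties file; `parse_properties` and `missing_index_bindings` are hypothetical helper names, not part of JanusGraph.

```python
from collections import defaultdict

# Settings every index.<name> namespace is expected to define explicitly.
REQUIRED_SUFFIXES = ("backend", "hostname")


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def missing_index_bindings(props):
    """Map each index name to the required settings it does not define."""
    seen = defaultdict(set)
    for key in props:
        parts = key.split(".")
        if len(parts) >= 3 and parts[0] == "index":
            seen[parts[1]].add(parts[2])
    return {
        name: [s for s in REQUIRED_SUFFIXES if s not in found]
        for name, found in seen.items()
        if any(s not in found for s in REQUIRED_SUFFIXES)
    }
```

Running `missing_index_bindings(parse_properties(open("janusgraph.properties").read()))` in a CI step returns an empty dict when every index namespace is fully bound.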

Consistency Models & Synchronization

JanusGraph decouples storage consistency from index visibility. With the CQL storage backend, vertex and edge mutations get tunable consistency such as QUORUM or LOCAL_QUORUM. Mixed index updates are not transactional with storage writes, and they become searchable only after the search backend refreshes, so the graph store and the search backend are eventually consistent.

Routing decisions must account for this propagation delay. Queries dispatched to the mixed index may return stale results if executed immediately after a write. To mitigate this, set storage.cql.read-consistency-level and storage.cql.write-consistency-level to QUORUM (or LOCAL_QUORUM) so that storage reads reflect acknowledged writes, and implement application-level read-after-write reconciliation for index-backed queries. For workloads requiring strict ordering, also monitor replication lag and index refresh lag. Detailed strategies for managing these trade-offs are documented in Configuring Mixed Index Fallback Chains.
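Application-level read-after-write reconciliation can be as simple as polling the index-backed query until the new element shows up or a deadline passes. The sketch below is one possible shape, not a JanusGraph API: `wait_until_visible` is a hypothetical helper, and `fetch` stands for any callable that runs the mixed-index query.

```python
import time


def wait_until_visible(fetch, is_visible, timeout=10.0, interval=0.5,
                       sleep=time.sleep, clock=time.monotonic):
    """Poll fetch() until is_visible(result) is true or the timeout expires.

    `sleep` and `clock` are injectable so the helper can be tested without waiting.
    """
    deadline = clock() + timeout
    while True:
        result = fetch()
        if is_visible(result):
            return result
        if clock() >= deadline:
            raise TimeoutError("write not visible in the mixed index within the timeout")
        sleep(interval)
```

A timeout slightly above the configured refresh interval is a reasonable starting point; a `TimeoutError` then signals that index lag exceeds what the refresh setting alone would explain.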

Python Pipeline Integration

Production pipelines must handle transient routing failures, backend timeouts, and consistency gaps. The following Python example demonstrates a resilient traversal pattern using gremlinpython with exponential backoff, jitter, and an empty-result check that flags possible index sync lag.

python
import logging
import random
import time

from gremlin_python.driver import client, serializer

logger = logging.getLogger(__name__)

# Configure the Gremlin client. Only keyword arguments that gremlinpython's
# Client accepts are passed; unrecognized ones are forwarded to the transport
# layer and can fail at connect time.
gremlin_client = client.Client(
    "ws://graph-cluster:8182/gremlin",
    "g",
    message_serializer=serializer.GraphSONSerializersV3d0(),
    pool_size=4,
)

def execute_with_backoff(query: str, bindings=None, max_retries: int = 4, base_delay: float = 2.0):
    """
    Executes a mixed-index routed traversal with exponential backoff and jitter.
    """
    last_error = None
    for attempt in range(max_retries):
        try:
            # submit() returns a ResultSet; all() yields a future holding every result.
            return gremlin_client.submit(query, bindings).all().result()
        except Exception as e:
            last_error = e
            if attempt == max_retries - 1:
                break  # Do not sleep after the final attempt.
            delay = base_delay * (2 ** attempt) + random.uniform(0, 1)
            logger.warning(f"Routing attempt {attempt + 1} failed: {e}. Retrying in {delay:.2f}s")
            time.sleep(delay)
    raise ConnectionError("Mixed index backend unreachable after max retries. Trigger fallback.") from last_error

def run_mixed_index_pipeline(predicate: str, limit: int = 100):
    # The predicate is spliced into the script text, so it must come from
    # trusted code (never from user input). The limit is sent as a binding.
    query = f"g.V().hasLabel('user').has('email', {predicate}).limit(lim).elementMap()"
    try:
        vertices = execute_with_backoff(query, {"lim": limit})
        if not vertices:
            logger.info("Mixed index routing returned empty set. Verify index sync lag.")
        return vertices
    except Exception as e:
        logger.critical(f"Pipeline degraded. Switching to storage-level scan: {e}")
        # Fallback to composite index or full scan logic here
        return []

Operational Validation & Fallbacks

Validate routing topology before deploying to production. In the management API, mgmt.printIndexes() and mgmt.getGraphIndex(name).getIndexStatus(key) show whether each property key bound to a mixed index is ENABLED on the target backend. Monitor search-client error and success counters (for example janusgraph.index.search.elasticsearch.client.errors and janusgraph.index.search.elasticsearch.client.success, if your metrics reporter exposes them under those names) to detect routing degradation. Implement circuit breakers at the application layer to isolate search backend failures from the Cassandra storage tier.
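An application-layer circuit breaker can be sketched in a few lines. The version below is a minimal illustration under stated assumptions: `CircuitBreaker` and `guarded_query` are hypothetical names, and `mixed_query` and `fallback_query` stand for callables that run the index-backed traversal and a composite-index alternative respectively.

```python
import time


class CircuitBreaker:
    """Opens after consecutive failures and allows a trial call after a cool-down."""

    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at = None

    def allow(self):
        if self._opened_at is None:
            return True  # Closed: calls pass through.
        # Open: permit a trial call only once the cool-down has elapsed.
        return self._clock() - self._opened_at >= self.reset_timeout

    def record_success(self):
        self._failures = 0
        self._opened_at = None

    def record_failure(self):
        self._failures += 1
        if self._failures >= self.failure_threshold:
            self._opened_at = self._clock()


def guarded_query(breaker, mixed_query, fallback_query):
    """Run the mixed-index query unless the breaker is open; fall back on failure."""
    if not breaker.allow():
        return fallback_query()
    try:
        result = mixed_query()
    except Exception:
        breaker.record_failure()
        return fallback_query()
    breaker.record_success()
    return result
```

While the breaker is open, calls skip the search backend entirely, so a failing Elasticsearch cluster does not add retry latency to every request that could be served from storage.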

When no enabled index covers a predicate, JanusGraph falls back to a full graph scan and logs a warning; setting query.force-index=true makes such queries fail fast instead. When the search backend itself is unreachable, the query raises an error rather than silently rerouting, so any fallback chain has to be configured explicitly at the application layer. Pre-define composite indexes for critical traversal paths to ensure graceful degradation. Size index shard counts for the expected bulk synchronization load to prevent hot-spotting. Reference the official Mixed Index documentation for schema validation procedures and review Elasticsearch refresh semantics to calibrate visibility windows against ingestion throughput.