This page offers guidance on how to integrate and use the Valyu DeepSearch API effectively to get the best results.

Building Agentic Search Workflows

Valyu delivers optimal performance when integrated into agentic search workflows rather than single-shot queries. The API is engineered for precision-driven workflows in which your AI system can search for exactly the knowledge it needs. Recommended Architecture:
# Multi-step agentic search workflow
# Note: decompose_query, adapt_strategy, identify_knowledge_gaps, and
# synthesize_multi_source_findings are application-defined helpers.
async def research_agent(query: str):
    # Step 1: Break down the complex query into focused sub-searches
    sub_queries = decompose_query(query)

    results = {}
    for i, sub_query in enumerate(sub_queries):
        # Step 2: Adapt search strategy based on previous findings
        strategy = adapt_strategy(sub_query, results)

        search_result = valyu.search(
            query=sub_query,
            included_sources=strategy.sources,
            max_price=strategy.budget,
            relevance_threshold=0.65
        )
        results[f"step_{i}"] = search_result

        # Step 3: Identify gaps and refine the next search
        gaps = identify_knowledge_gaps(search_result, query)
        if gaps:
            gap_result = valyu.search(
                query=gaps[0].refined_query,
                included_sources=gaps[0].target_sources,
                max_price=50.0
            )
            results[f"gap_fill_{i}"] = gap_result

    # Step 4: Cross-validate and synthesize findings across all steps
    return synthesize_multi_source_findings(results)
Agentic advantage: Technical domains like research, finance, and medicine benefit most from multi-step search workflows that leverage Valyu’s comprehensive search indexes across academic, financial, and proprietary sources.
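To make the skeleton above concrete, here is a minimal sketch of one of the application-defined helpers, decompose_query. The llm_complete function is a hypothetical stand-in for whatever LLM client you use; the prompt and parsing are illustrative, not part of the Valyu API:
# Illustrative only: llm_complete is a hypothetical stand-in for your LLM client
def decompose_query(query: str) -> list[str]:
    """Split a broad research question into focused, searchable sub-queries."""
    prompt = (
        "Break the following research question into 2-4 focused search queries, "
        "one per line, with no numbering:\n\n" + query
    )
    response = llm_complete(prompt)  # hypothetical helper; swap in your LLM call
    # Keep non-empty lines as individual sub-queries
    return [line.strip() for line in response.splitlines() if line.strip()]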

Human vs. AI Search Optimization

Valyu’s search algorithms are optimized for AI model consumption, not human browsing patterns. The semantic ranking and result structuring are designed for LLM tool calls and agent workflows. For AI-driven searches (recommended):
# Optimized for LLM consumption
response = valyu.search(
    "quantum error correction surface codes LDPC performance benchmarks",
    tool_call_mode=True,  # Default: AI-optimized results
    max_num_results=8,
    relevance_threshold=0.6
)
For human-facing searches:
# Adjusted for human readability
response = valyu.search(
    "quantum computing error correction methods",
    tool_call_mode=False,  # Human-optimized formatting
    max_num_results=5,
    results_length="medium"
)
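Because results come back AI-optimized by default, valyu.search slots naturally into an agent's tool loop. Below is a minimal sketch of wrapping it as a tool function; the tool signature and the JSON serialization are assumptions to adapt to your agent framework's tool schema:
import json

def valyu_search_tool(query: str, max_num_results: int = 8) -> str:
    """Tool wrapper: run an AI-optimized search and return a JSON string
    the calling LLM can read directly."""
    response = valyu.search(
        query,
        tool_call_mode=True,
        max_num_results=max_num_results,
        relevance_threshold=0.6,
    )
    # Crude serialization; adapt the shape to your framework's tool schema
    return json.dumps(response, default=str)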

Maximizing Valyu’s Search Parameters

Combine optimized prompts with Valyu’s granular parameters for better performance:
response = valyu.search(
    "GPT-4 vs GPT-3 architectural innovations: training efficiency, inference optimization, and benchmark comparisons",
    search_type="proprietary",
    max_num_results=10,
    relevance_threshold=0.6,
    included_sources=["valyu/valyu-arxiv"],
    max_price=50.0,
    category="machine learning"
    start_date="2024-01-01",
    end_date="2024-12-31"
)
Pro tip: Leverage Valyu’s beyond-the-web capabilities with included_sources like valyu/valyu-arxiv for academic content, financial market data, or specialized datasets that other APIs can’t access.

Quality and Budget Optimization

Scaling Search Quality with Budget

Not getting sufficient results? Increase your max_price parameter to access higher-quality sources and expand coverage across Valyu’s proprietary datasets.
# Budget progression for increasing quality
search_configs = [
    {"max_price": 25.0, "use_case": "Quick fact-checking"},
    {"max_price": 50.0, "use_case": "Standard research queries"},
    {"max_price": 100.0, "use_case": "Comprehensive analysis"},
]
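One way to put this progression to work is a simple escalation loop: start cheap and re-run at a higher budget only when the result set comes back thin. The min_results cutoff and the response.results access below are assumptions about your quality bar and the SDK’s response shape:
def search_with_escalation(query: str, min_results: int = 5):
    """Retry the same query at increasing budgets until enough results return."""
    for config in search_configs:
        response = valyu.search(
            query,
            max_price=config["max_price"],
            relevance_threshold=0.6,
        )
        # Assumed response shape: a .results list; adapt to the actual SDK object
        if len(response.results) >= min_results:
            return response
    return response  # best effort at the highest budget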
Budget Impact on Results:
  • $25: Basic web sources + limited academic content
  • $50: Full web coverage + major academic databases
  • $100: Premium sources + financial data + specialized datasets
Cost optimization: Higher budgets unlock authoritative sources that other APIs can’t access, including exclusive academic publishers, financial terminals, and curated research databases.

Context Window Management

Worried about token consumption? Valyu provides granular controls for managing LLM context usage:
# Optimize for different context requirements
lightweight_search = valyu.search(
    query="transformer architecture innovations",
    max_num_results=3,        # Fewer results
    results_length="short",   # Condensed content
    max_price=50.0
)

comprehensive_search = valyu.search(
    query="transformer architecture innovations",
    max_num_results=15,       # More coverage
    results_length="long",    # Full content
    max_price=100.0
)
Token Estimation Guide:
  • Short results: Max ~6k tokens per result (25k chars)
  • Medium results: Max ~12k tokens per result (50k chars)
  • Long results: Max ~25k tokens per result (100k chars)
  • Rule of thumb: 4 characters ≈ 1 token
Context strategy: Start with max_num_results=10 and results_length="short" for most use cases, then adjust based on your LLM’s context window and accuracy requirements.
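Applying the rule of thumb above, a tiny helper can estimate how much of your context window a configuration could consume before you run the search. The per-result character caps come from the guide above; everything else is an assumption:
# Character caps per result, from the token estimation guide above
RESULT_CHAR_CAPS = {"short": 25_000, "medium": 50_000, "long": 100_000}

def estimate_max_tokens(max_num_results: int, results_length: str = "short") -> int:
    """Worst-case token estimate for a search, at ~4 characters per token."""
    return (RESULT_CHAR_CAPS[results_length] * max_num_results) // 4

# e.g. 10 short results could consume up to ~62,500 tokens
print(estimate_max_tokens(10, "short"))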

Discovering Specialized Datasets

Exploring the Valyu Curated Datasets

Access curated, high-quality datasets beyond standard web search. Browse the full catalog on the Valyu Platform Datasets page. Dataset Categories:
  • Academic: ArXiv, PubMed, academic publisher content
  • Financial: SEC filings, earnings reports, market data
  • Technical: Patents, specifications, implementation guides
  • Books & Literature: Digitized texts, reference materials
Each dataset provides specific return schemas optimized for different use cases:
# Target specific datasets for specialized searches
academic_search = valyu.search(
    "CRISPR gene editing clinical trials safety outcomes",
    included_sources=["valyu/valyu-pubmed", "valyu/valyu-US-clinical-trials"],
    max_price=30.0
)

financial_search = valyu.search(
    "Tesla Q3 2024 earnings revenue breakdown",
    included_sources=["valyu/valyu-US-sec-filings", "valyu/valyu-US-earnings"],
    max_price=60.0
)
Data advantage: Proprietary datasets on the exchange often contain information unavailable through standard web APIs, giving your AI system access to authoritative, structured knowledge that improves factual accuracy.

Optimizing your AI Integration

Avoiding Common Integration Issues

  1. Token waste: Use max_num_results and results_length parameters to manage LLM context consumption
  2. Missing filters: Always use Valyu’s relevance thresholds and source controls for precision (see the sketch after this list)
  3. Ignoring cost optimization: Balance max_price with result quality needs based on your use case
  4. Wrong source expectations: Match dataset selection to your domain, whether academic, financial, or web sources
  5. Inefficient workflows: Implement agentic search patterns rather than single-shot queries for complex research tasks
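The sketch below pulls points 1 through 4 together into one defensively parameterized call; the specific values and the DEFAULT_SOURCES list are assumptions to tune per domain:
# Assumed defaults; tune sources, threshold, and budget for your domain
DEFAULT_SOURCES = ["valyu/valyu-arxiv"]

def focused_search(query: str, sources: list[str] | None = None):
    """A defensively parameterized search covering the pitfalls above."""
    return valyu.search(
        query,
        included_sources=sources or DEFAULT_SOURCES,  # 4: match sources to domain
        relevance_threshold=0.6,                      # 2: filter weak matches
        max_num_results=8,                            # 1: cap context consumption
        results_length="short",                       # 1: condensed content
        max_price=50.0,                               # 3: explicit cost ceiling
    )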

Start Building with Valyu

Ready to integrate production-grade search into your AI stack?

Developer Support

Building something ambitious? Our team helps optimize search strategies for mission-critical AI applications.
Performance tip: The most effective integrations combine domain-specific workflows with Valyu’s search controls. Start with agentic patterns, then optimize based on your AI system’s specific requirements.