The Search API provides powerful search capabilities across web and proprietary data sources, returning relevant content optimized for AI applications and RAG pipelines.

Basic Usage

from valyu import Valyu

valyu = Valyu()

response = valyu.search(
    "What are the latest developments in quantum computing?"
)

print(f"Found {len(response.results)} results")
for result in response.results:
    print(f"Title: {result.title}")
    print(f"URL: {result.url}")
    print(f"Content: {result.content[:200]}...")

Parameters

Query (Required)

ParameterTypeDescription
querystrThe search query string

Options (Optional)

ParameterTypeDescriptionDefault
search_type"web" | "proprietary" | "all"Type of search to perform. "web" for web search, "proprietary" for Valyu datasets, "all" for both"all"
max_num_resultsintMaximum number of results to return (1-20)10
max_priceintMaximum cost in dollars per thousand retrievals (CPM)30.0
is_tool_callboolSet to True when called by AI agents/tools, False for direct user queriesTrue
relevance_thresholdfloatMinimum relevance score for results (0.0-1.0)0.5
included_sourcesList[str]List of specific sources to search within (URLs, domains, or dataset names)None
excluded_sourcesList[str]List of sources to exclude from search resultsNone
categorystrNatural language category to guide search contextNone
start_datestrStart date for time filtering (YYYY-MM-DD format)None
end_datestrEnd date for time filtering (YYYY-MM-DD format)None
country_codestr2-letter ISO country code to bias search resultsNone
response_lengthstr | intContent length per result: "short" (25k), "medium" (50k), "large" (100k), "max" (full), or custom character count"short"
fast_modeboolEnable fast mode for reduced latency but shorter resultsFalse

Response Format

class SearchResponse:
    success: bool
    error: Optional[str]
    tx_id: str
    query: str
    results: List[SearchResult]
    results_by_source: ResultsBySource
    total_deduction_pcm: float
    total_deduction_dollars: float
    total_characters: int

class SearchResult:
    title: str
    url: str
    content: str
    description: Optional[str]
    source: str
    price: float
    length: int
    relevance_score: float
    data_type: Optional[str]  # "structured" | "unstructured"
    # Additional fields for academic sources
    publication_date: Optional[str]
    authors: Optional[List[str]]
    citation: Optional[str]
    doi: Optional[str]
    # ... other optional fields

Parameter Examples

Fast Mode

Enable fast mode for quicker results:
response = valyu.search(query, fast_mode=True)
Responses will be quicker but the content will be shorter.

Search Type Configuration

Control which data sources to search:
# Web search only
web_response = valyu.search(query, search_type="web", max_num_results=10)

# Proprietary datasets only
proprietary_response = valyu.search(query, search_type="proprietary", max_num_results=8)

# Both web and proprietary (default)
all_response = valyu.search(query, search_type="all", max_num_results=12)

Source Filtering

Control which specific sources to include or exclude:
response = valyu.search(
    "quantum computing applications",
    search_type="all",
    max_num_results=10,
    included_sources=["nature.com", "science.org", "valyu/valyu-arxiv"],
    response_length="medium"
)
response = valyu.search(
    "quantum computing applications",
    search_type="all",
    max_num_results=10,
    excluded_sources=["reddit.com", "quora.com"],
    response_length="medium"
)
You can either include or exclude sources, but not both.

Geographic and Date Filtering

Bias results by location and time range:
response = valyu.search(
    "renewable energy policies",
    country_code="US",
    start_date="2024-01-01",
    end_date="2024-12-31",
    max_num_results=7,
    category="government policy"
)

Response Length Control

Customize content length per result:
# Predefined lengths
short_response = valyu.search(query, response_length="short")  # 25k characters
medium_response = valyu.search(query, response_length="medium")  # 50k characters
large_response = valyu.search(query, response_length="large")  # 100k characters

# Custom character limit
custom_response = valyu.search(query, response_length=15000)  # Custom limit

Use Case Examples

Academic Research Assistant

Build a comprehensive research tool that searches across academic databases:
def academic_research(query: str):
    response = valyu.search(
        query,
        search_type="proprietary",
        included_sources=["valyu/valyu-pubmed", "valyu/valyu-arxiv"],
        max_num_results=15,
        response_length="large",
        category="academic research"
    )

    if response.success:
        print(f"=== Academic Research Results ===")
        print(f"Found {len(response.results)} papers for: \"{query}\"")
        
        # Group by source
        arxiv_papers = [r for r in response.results if "arxiv" in r.source]
        pubmed_papers = [r for r in response.results if "pubmed" in r.source]
        
        print(f"\nArXiv Papers: {len(arxiv_papers)}")
        for i, paper in enumerate(arxiv_papers, 1):
            print(f"{i}. {paper.title}")
            print(f"   Relevance: {paper.relevance_score:.2f}")
            print(f"   URL: {paper.url}")
            if paper.publication_date:
                print(f"   Published: {paper.publication_date}")
            if paper.authors:
                print(f"   Authors: {', '.join(paper.authors)}")
        
        print(f"\nPubMed Articles: {len(pubmed_papers)}")
        for i, article in enumerate(pubmed_papers, 1):
            print(f"{i}. {article.title}")
            print(f"   Relevance: {article.relevance_score:.2f}")
            if article.citation:
                print(f"   Citation: {article.citation}")
        
        return {
            "arxiv": arxiv_papers,
            "pubmed": pubmed_papers,
            "query": query
        }
    
    return None

# Usage examples - dates are included in natural language
covid_research = academic_research(
    "COVID-19 vaccine efficacy studies published between 2023 and 2024"
)

ai_research = academic_research(
    "recent machine learning breakthroughs in the last 2 years"
)

climate_research = academic_research(
    "climate change mitigation strategies peer-reviewed research since 2020"
)

Financial Market Intelligence

Create a financial analysis tool that searches market data and news:
def financial_intelligence(query: str, analysis_type: str):
    """analysis_type: 'fundamental', 'technical', or 'news'"""
    sources = {
        "fundamental": ["valyu/valyu-stocks-US", "sec.gov", "investor.com"],
        "technical": ["valyu/valyu-stocks-US", "tradingview.com", "yahoo.com"],
        "news": ["bloomberg.com", "cnbc.com", "reuters.com", "marketwatch.com"]
    }

    response = valyu.search(
        query,
        search_type="all" if analysis_type in ["fundamental", "technical"] else "web",
        included_sources=sources[analysis_type],
        max_num_results=10,
        response_length="medium",
        category="financial analysis"
    )

    if response.success:
        print(f"=== {analysis_type.upper()} Analysis ===")
        print(f"Query: \"{query}\"")
        
        for i, result in enumerate(response.results, 1):
            print(f"\n{i}. {result.title}")
            print(f"   Source: {result.source}")
            print(f"   Relevance: {result.relevance_score:.2f}")
            print(f"   URL: {result.url}")
            
            # Show excerpt for financial data
            if len(result.content) > 200:
                print(f"   Preview: {result.content[:200]}...")
            
            if result.publication_date:
                print(f"   Date: {result.publication_date}")
        
        return {
            "results": response.results,
            "analysis_type": analysis_type,
            "query": query
        }
    
    return None

# Usage examples - include timeframes in natural language
tesla_fundamentals = financial_intelligence(
    "Tesla financial performance Q3 2024 earnings revenue profit margins", 
    "fundamental"
)

apple_news = financial_intelligence(
    "Apple latest news this week product announcements stock updates", 
    "news"
)

bitcoin_technical = financial_intelligence(
    "Bitcoin price analysis technical indicators support resistance levels recent trends",
    "technical"
)

Real-time News Monitoring

Build a news monitoring system that tracks specific topics across multiple sources:
def news_monitoring(queries: List[str]):
    all_results = []
    
    for query in queries:
        response = valyu.search(
            query,
            search_type="web",
            included_sources=[
                "reuters.com",
                "bloomberg.com", 
                "cnbc.com",
                "techcrunch.com",
                "theverge.com"
            ],
            max_num_results=8,
            response_length="short"
        )

        if response.success:
            all_results.append({
                "query": query,
                "articles": response.results,
                "count": len(response.results)
            })

    # Generate monitoring report
    print(f"=== News Monitoring Report ===")
    print(f"Monitoring {len(queries)} topics\n")
    
    for result in all_results:
        query, articles, count = result["query"], result["articles"], result["count"]
        print(f"📰 {query.upper()}: {count} articles")
        
        for i, article in enumerate(articles, 1):
            print(f"   {i}. {article.title}")
            print(f"      Source: {article.source} | Relevance: {article.relevance_score:.2f}")
            print(f"      URL: {article.url}")
        print("")

    return all_results

# Usage examples - timeframes included in natural language
tech_news = news_monitoring([
    "artificial intelligence breakthroughs this week",
    "quantum computing progress recent developments", 
    "cryptocurrency regulation news today"
])

business_news = news_monitoring([
    "Tesla earnings report Q4 2024",
    "Federal Reserve interest rate decisions this month",
    "tech layoffs news recent announcements"
])

Error Handling

The Search API includes comprehensive error handling and validation:
response = valyu.search("test query", max_num_results=5)

if not response.success:
    print("Search failed:", response.error)
    
    # Handle specific error cases
    if "insufficient credits" in response.error:
        print("Please add more credits to your account")
    elif "invalid" in response.error:
        print("Check your search parameters")
    
    return

# Process successful results
print(f"Transaction ID: {response.tx_id}")

for index, result in enumerate(response.results):
    print(f"{index + 1}. {result.title}")
    print(f"   Relevance: {result.relevance_score}")
    print(f"   Source: {result.source}")
    print(f"   URL: {result.url}")

Source Types

Web Sources

  • General websites and domains
  • News sites and blogs
  • Forums and community sites
  • Documentation sites

Proprietary Sources

  • valyu/valyu-arxiv - Academic papers from arXiv
  • valyu/valyu-pubmed - Medical and life science literature
  • valyu/valyu-stocks-US - US stock market data
  • And many more. Check out the Valyu Platform Datasets for more information.