Take precise control over your search sources with Valyu’s source filtering capabilities. Whether you need to focus on specific authoritative domains, access particular datasets, or exclude unreliable sources, source filtering ensures your AI gets exactly the right content.
Source filters accept domains, URLs, dataset names, or specific paths. Both included_sources and excluded_sources are optional arrays; if both are provided, included_sources overrides excluded_sources.

Why Use Source Filtering?

Source filtering provides granular control over your search results, enabling you to:
  • 🎯 Target authoritative sources - Focus on trusted domains and academic datasets
  • 🚫 Block unreliable content - Exclude known low-quality or biased sources
  • πŸ“š Access specific datasets - Search within Valyu’s proprietary academic collections
  • ⚑ Improve result quality - Get more relevant, higher-quality information

Parameters Reference

included_sources

Type: Array of strings
Only search within these specific sources. Can include domains, URLs, or dataset names.
Example: ["arxiv.org", "valyu/valyu-pubmed"]

excluded_sources

Type: Array of strings
Exclude these sources from search results. Supports the same formats as included_sources.
Example: ["reddit.com", "news.ycombinator.com"]
If both included_sources and excluded_sources are provided, the included sources will override the excluded sources.
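Because an entry that appears in both lists is ambiguous to a reader (the included list wins), it can help to check the two arrays locally before sending a request. The following is a minimal sketch of such a check. `check_source_filters` is a hypothetical helper, not part of the Valyu SDK, and it only inspects the Python values; it makes no claim about how the service validates them.

```python
from typing import List, Optional, Sequence


def check_source_filters(
    included_sources: Optional[Sequence[str]] = None,
    excluded_sources: Optional[Sequence[str]] = None,
) -> List[str]:
    """Return a list of problems found; an empty list means nothing was flagged."""
    problems: List[str] = []
    named = (
        ("included_sources", included_sources),
        ("excluded_sources", excluded_sources),
    )
    for name, value in named:
        if value is None:
            continue  # both parameters are optional
        # A bare string is a common mistake: it is not an array of strings.
        is_array = isinstance(value, (list, tuple))
        if not is_array or not all(isinstance(item, str) for item in value):
            problems.append(f"{name} must be an array of strings")

    if not problems and included_sources and excluded_sources:
        # Entries in both lists are redundant, since included overrides excluded.
        for source in sorted(set(included_sources) & set(excluded_sources)):
            problems.append(f"{source!r} appears in both lists")
    return problems
```

Calling `check_source_filters(["arxiv.org", "reddit.com"], ["reddit.com"])` flags `reddit.com`, which is a hint to remove it from one of the two lists.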

Source Format Options

Valyu supports multiple source specification formats for maximum flexibility:
| Format | Example | What It Does |
|---|---|---|
| Domain Name | "arxiv.org" | Includes/excludes entire domain |
| Base URL | "https://docs.aws.amazon.com" | Includes/excludes entire site |
| Specific Path | "techcrunch.com/news" | Targets only that specific path |
| Dataset Name | "valyu/valyu-arxiv" | Searches Valyu’s proprietary datasets |
Path Specificity: When using specific paths (e.g., "valyu.network/blog"), only that exact path is included/excluded. For entire domains, use just the domain name or base URL.
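The four formats can be told apart locally, which is handy when a source list comes from user input. The sketch below shows one way to do it. `classify_source` is a hypothetical helper, not part of the Valyu SDK, and the rule that a dataset name looks like `provider/name` with no dot in the first segment is an assumption drawn from the examples on this page.

```python
from urllib.parse import urlparse


def classify_source(source: str) -> str:
    """Label a source string as 'domain', 'base_url', 'path', or 'dataset'."""
    text = source.strip()

    if "://" in text:
        # Full URL: anything after the host beyond a bare "/" counts as a path.
        parsed = urlparse(text)
        return "path" if parsed.path.strip("/") else "base_url"

    head, _, tail = text.partition("/")
    if "." not in head:
        # Assumption: dataset names have no dot before the first slash.
        return "dataset" if tail else "unknown"
    return "path" if tail.strip("/") else "domain"


if __name__ == "__main__":
    examples = [
        "arxiv.org",
        "https://docs.aws.amazon.com",
        "techcrunch.com/news",
        "valyu/valyu-arxiv",
    ]
    for example in examples:
        print(example, "->", classify_source(example))
```

A label of "path" is a useful warning sign when the intent was to cover a whole site, since only that path would be included or excluded.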

Quick Start Examples

Focus on Academic Sources

Perfect for research requiring peer-reviewed, scholarly content:
from valyu import Valyu

valyu = Valyu(api_key="your-valyu-api-key")
response = valyu.search(
    query="quantum computing error correction",
    included_sources=[
        "arxiv.org",
        "valyu/valyu-arxiv", 
        "valyu/valyu-pubmed",
        "nature.com",
        "science.org"
    ],
    search_type="all"
)

Exclude Social Media and Forums

Remove noise from social platforms and discussion forums:
from valyu import Valyu

valyu = Valyu(api_key="your-valyu-api-key")
response = valyu.search(
    query="artificial intelligence safety research",
    excluded_sources=[
        "reddit.com",
        "news.ycombinator.com", 
        "twitter.com",
        "linkedin.com",
        "quora.com"
    ]
)

Target Specific Documentation Sites

Focus on official documentation and technical resources:
from valyu import Valyu

valyu = Valyu(api_key="your-valyu-api-key")
response = valyu.search(
    query="React server components best practices",
    included_sources=[
        "https://react.dev/",
        "https://nextjs.org/docs",
        "https://docs.aws.amazon.com/",
        "developer.mozilla.org"
    ]
)

Advanced Usage Patterns

Combine Include and Exclude Filters

Use both filters together for maximum precision:
from valyu import Valyu

valyu = Valyu(api_key="your-valyu-api-key")
response = valyu.search(
    query="machine learning model deployment",
    included_sources=[
        "docs.aws.amazon.com",
        "cloud.google.com/docs", 
        "docs.microsoft.com/azure",
        "kubernetes.io"
    ],
    excluded_sources=[
        "stackoverflow.com",  # Exclude forums
        "medium.com"          # Exclude blog posts
    ]
)

Dataset-Specific Academic Research

Access Valyu’s proprietary academic datasets:
from valyu import Valyu

valyu = Valyu(api_key="your-valyu-api-key")
response = valyu.search(
    query="CRISPR gene therapy clinical trials",
    included_sources=[
        "valyu/valyu-pubmed",    # Medical research
        "valyu/valyu-arxiv",     # Preprints and papers
        "clinicaltrials.gov"     # Trial data
    ],
    search_type="proprietary"
)

Financial Data Sources

Target authoritative financial and market data:
from valyu import Valyu

valyu = Valyu(api_key="your-valyu-api-key")
response = valyu.search(
    query="Federal Reserve interest rate policy 2024",
    included_sources=[
        "federalreserve.gov",
        "sec.gov",
        "reuters.com/business",
        "bloomberg.com",
        "wsj.com"
    ],
    excluded_sources=[
        "reddit.com/r/investing",
        "seekingalpha.com"  # Exclude opinion pieces
    ]
)

Use Case Examples

📊 Financial Research

Target authoritative financial sources for market analysis and economic research:
from valyu import Valyu

valyu = Valyu(api_key="your-valyu-api-key")
response = valyu.search(
    query="cryptocurrency regulation impact banking sector",
    included_sources=[
        "federalreserve.gov",
        "sec.gov", 
        "reuters.com/business",
        "bloomberg.com",
        "imf.org"
    ],
    max_num_results=15
)

🔬 Medical Research

Access peer-reviewed medical literature and clinical data:
from valyu import Valyu

valyu = Valyu(api_key="your-valyu-api-key")
response = valyu.search(
    query="immunotherapy cancer treatment efficacy",
    included_sources=[
        "valyu/valyu-pubmed",
        "nejm.org",
        "thelancet.com", 
        "nature.com/nm",
        "clinicaltrials.gov"
    ],
    search_type="proprietary"
)

💻 Technical Documentation

Focus on official documentation and authoritative technical sources:
from valyu import Valyu

valyu = Valyu(api_key="your-valyu-api-key")
response = valyu.search(
    query="Kubernetes security best practices RBAC",
    included_sources=[
        "kubernetes.io/docs",
        "docs.aws.amazon.com",
        "cloud.google.com/kubernetes-engine/docs"
    ],
    excluded_sources=[
        "stackoverflow.com",
        "medium.com",
        "dev.to"  # Exclude community blogs
    ]
)

📰 News and Current Events

Get balanced news coverage while avoiding biased or unreliable sources:
from valyu import Valyu

valyu = Valyu(api_key="your-valyu-api-key")
response = valyu.search(
    query="artificial intelligence regulation European Union",
    included_sources=[
        "reuters.com",
        "bbc.com/news",
        "apnews.com", 
        "europa.eu",
        "politico.eu"
    ],
    excluded_sources=[
        "breitbart.com",
        "infowars.com",
        "rt.com"  # Exclude biased sources
    ]
)

Best Practices

Performance Tip: Source filtering can significantly improve response quality and relevance by focusing on authoritative, high-quality sources.

Strategic Source Selection

| Research Type | Recommended Sources | Sources to Exclude |
|---|---|---|
| Academic Research | valyu/valyu-arxiv, nature.com, science.org | Social media, forums, blog platforms |
| Technical Documentation | Official docs, developer.mozilla.org | stackoverflow.com, community blogs |
| Financial Analysis | federalreserve.gov, bloomberg.com, sec.gov | Reddit, opinion blogs, social media |
| News & Current Events | reuters.com, bbc.com, apnews.com | Biased sources, tabloids, social media |
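The recommendations in the table can be kept as reusable presets so that every search of a given type starts from the same list. This is a minimal sketch: the preset keys and the `included_sources_for` helper are hypothetical names, and only rows with concrete source names are covered (the technical documentation row is left out because "Official docs" depends on the project).

```python
from typing import Dict, Iterable, List

# Source lists copied from the table; the dictionary keys are illustrative.
RECOMMENDED_SOURCES: Dict[str, List[str]] = {
    "academic": ["valyu/valyu-arxiv", "nature.com", "science.org"],
    "financial": ["federalreserve.gov", "bloomberg.com", "sec.gov"],
    "news": ["reuters.com", "bbc.com", "apnews.com"],
}


def included_sources_for(research_type: str, extra: Iterable[str] = ()) -> List[str]:
    """Return the preset for a research type plus any extra sources, de-duplicated."""
    merged = list(RECOMMENDED_SOURCES[research_type])  # copy, keep preset intact
    for source in extra:
        if source not in merged:
            merged.append(source)
    return merged
```

The returned list can be passed directly as the included_sources argument, for example `included_sources=included_sources_for("news", ["europa.eu"])`.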
included_sources restricts the search to content from the listed sources only. If none of them contains relevant content, the search returns 0 results.
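Since a strict include list can come back empty, an application may want to retry without the filter. The sketch below shows that pattern under stated assumptions: `search_fn` is a hypothetical wrapper that you write around your search call and that returns a plain list of results, and `fake_search` is a stand-in used only to make the example runnable. How results are read from the real response object is not covered here.

```python
from typing import Any, Callable, List, Sequence, Tuple


def search_with_fallback(
    search_fn: Callable[..., List[Any]],
    query: str,
    included_sources: Sequence[str],
) -> Tuple[List[Any], bool]:
    """Search the included sources first; retry unfiltered if nothing comes back.

    Returns (results, filtered), where filtered is True when the results
    came from the restricted search.
    """
    results = search_fn(query=query, included_sources=list(included_sources))
    if results:
        return results, True
    # Nothing relevant in the included sources: fall back to an open search.
    return search_fn(query=query), False


def fake_search(query: str, included_sources=None) -> List[Any]:
    """Stand-in for a real search wrapper: filtered searches find nothing."""
    if included_sources:
        return []
    return [{"title": "unfiltered hit", "query": query}]


if __name__ == "__main__":
    hits, filtered = search_with_fallback(fake_search, "rare topic", ["example.org"])
    print(len(hits), "result(s); filtered =", filtered)
```

Returning the `filtered` flag lets the caller tell users when results fall outside the trusted list, rather than silently widening the search.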

Source Format Guidelines

Domain-Level Filtering:
# Include entire domain
included_sources = ["arxiv.org", "nature.com"]

# Exclude entire domain
excluded_sources = ["reddit.com", "twitter.com"]

Path-Specific Filtering:
# Target specific sections
included_sources = [
    "docs.aws.amazon.com/lambda",  # Only Lambda docs
    "kubernetes.io/docs/concepts"  # Only concepts section
]

Dataset Access:
# Access Valyu's proprietary datasets
included_sources = [
    "valyu/valyu-arxiv",    # ArXiv papers
    "valyu/valyu-pubmed"    # Medical literature
]
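When sources are collected as full URLs but the intent is domain-level filtering, they can be reduced to bare hostnames first. The following is a rough sketch: `to_domain` is a hypothetical helper, and it passes dataset-style names through unchanged on the assumption that they never contain a dot before the first slash. It does not strip a leading "www.", since this page does not say whether the two forms are treated alike.

```python
from urllib.parse import urlparse


def to_domain(url_or_domain: str) -> str:
    """Reduce a URL or path-style source to its lowercase hostname."""
    text = url_or_domain.strip()
    first_segment = text.split("://", 1)[-1].split("/", 1)[0]
    if "." not in first_segment:
        # Looks like a dataset name such as "valyu/valyu-arxiv": leave it alone.
        return text
    if "://" not in text:
        text = "//" + text  # lets urlparse read the leading part as the host
    return urlparse(text).hostname or ""


if __name__ == "__main__":
    raw = ["https://docs.aws.amazon.com/lambda/latest", "Nature.com/nm", "valyu/valyu-arxiv"]
    included_sources = [to_domain(item) for item in raw]
    print(included_sources)
```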

Source filtering provides powerful control over your Valyu DeepSearch results, enabling precise content curation from trusted, authoritative sources while excluding noise and unreliable information.