R.I.P. to Get-PnPSearchCrawlLog: Search as an alternative, with gotchas. Not a replacement for the crawl log.

February 19, 2026

TL;DR:

The scripts in this post use PnP.PowerShell to enumerate document libraries and compare file URLs against Search results to determine whether files are searchable. The original plan was to also show whether those files had been indexed successfully, but Get-PnPSearchCrawlLog is now deprecated because the underlying CSOM endpoint GetCrawledUrls has been deprecated; see Deprecate Get-PnPSearchCrawlLog cmdlet for more details.
Use the first script when you want a quick list of files that are present in a library but do not appear in Search results. The second script was a nice-to-have that pulled crawl log information to determine indexing recency for further debugging. It was working last week, but it is no longer usable.

Why this is useful

Administrators often find files that appear in the search index but are not returned via the Search API, or vice versa. That mismatch can indicate index freshness issues, excluded file types, or problems with the crawl pipeline.
Microsoft deprecated Get-PnPSearchCrawlLog in some contexts; you can still rely on Submit-PnPSearchQuery to determine whether files appear in Search results. It is not, however, a replacement for debugging search issues further through the crawl log API.

Limitations / Caveats

Search-only checks do not prove freshness. A file may be indexed but with stale metadata, and the scripts cannot reliably say when the index entry was created unless you query the crawl log, which is no longer possible.
Some file types are not returned when querying this endpoint by the Path property (for example images and certain Office formats). The scripts include an exclusion list to keep results cleaner.
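
As a rough illustration of the exclusion list mentioned above, the sketch below filters a handful of made-up file paths the same way the scripts do. The function name Test-ExcludedExtension and the sample paths are hypothetical; only the extension list is taken from the scripts in this post.

```powershell
# Extensions the scripts skip because they were not returned by Path: queries.
$ExcludedExtensions = @('.png', '.jpg', '.jpeg', '.xltx', '.one', '.onetoc2', '.gif', '.mp4', '.agent')

# Hypothetical helper: returns $true when a path ends in an excluded extension.
function Test-ExcludedExtension {
    param(
        [string]$Path,
        [string[]]$Excluded
    )
    $ext = [System.IO.Path]::GetExtension($Path)
    if ([string]::IsNullOrEmpty($ext)) { return $false }
    return ($Excluded -contains $ext.ToLowerInvariant())
}

# Made-up sample paths, for illustration only.
$samplePaths = @(
    '/sites/demo/Shared Documents/Plan.docx',
    '/sites/demo/Shared Documents/Logo.PNG',
    '/sites/demo/Shared Documents/Notes.one',
    '/sites/demo/Shared Documents/Budget.xlsx'
)

# Keep only the files that a Path: comparison can meaningfully check.
$checkable = @($samplePaths | Where-Object { -not (Test-ExcludedExtension -Path $_ -Excluded $ExcludedExtensions) })
$checkable
```

Files dropped by this filter are simply left out of the comparison, so their absence from the report says nothing about whether they are searchable.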

Why is observability through the crawl log important?

Kasper, another MVP specialising in Search, shared his insights in You Can’t Fix What You Can’t See: A Step Back for Microsoft 365 Search.

My use cases using the Crawl log API

Essential for monitoring how much content is being crawled per hour or per day, especially during migrations. It helps us understand the load on the crawl subsystem and assess whether it may be impacting other features that depend on timely indexing, such as intranet news publishing. We’ve had incidents in the past where published news wasn’t visible to users, even when it was business‑critical, because the crawl was saturated. As a workaround, we relied on the CrawlLog API to “pulse” the crawl and confirm it wasn’t overwhelmed by migrated content, ensuring that business‑as‑usual functionality such as the intranet remained unaffected.
It also allows us to verify the freshness of indexed data, for example by checking that the crawled date is later than the modified date. This makes it much easier to diagnose broader issues, such as files not appearing in search even after a site reindex. In fact, whenever we’ve raised Microsoft support tickets related to search, engineers have consistently advised us to use the CrawlLog API as the first diagnostic step.
Additionally, I’ve noticed that certain file types, such as JPEG, PNG, and MP3, are not returned from the Search API, even though I could see they were indexed and fully searchable through the SharePoint UI. This gap makes programmatic diagnostics even more dependent on the CrawlLog API, as it provides the only reliable way to validate whether these items were actually crawled and when. Microsoft should provide an alternative to the crawl log so we can diagnose issues transparently and maintain trust in the platform.
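
The two checks described above (crawl volume per hour, and crawled date versus modified date) can be sketched on in-memory data. The records below are invented, and the Modified property is an assumption added for illustration; this is not output from a live crawl log, which can no longer be queried.

```powershell
# Invented records standing in for crawl log rows joined with file metadata.
$entries = @(
    [pscustomobject]@{ Url = 'https://contoso.sharepoint.com/sites/demo/a.docx'; CrawlTime = [datetime]'2026-02-10T09:15:00'; Modified = [datetime]'2026-02-10T08:50:00' }
    [pscustomobject]@{ Url = 'https://contoso.sharepoint.com/sites/demo/b.docx'; CrawlTime = [datetime]'2026-02-10T09:40:00'; Modified = [datetime]'2026-02-10T11:05:00' }
    [pscustomobject]@{ Url = 'https://contoso.sharepoint.com/sites/demo/c.docx'; CrawlTime = [datetime]'2026-02-10T10:20:00'; Modified = [datetime]'2026-02-09T16:30:00' }
)

# Crawl volume per hour: bucket each entry by the hour it was crawled.
$perHour = @(
    $entries |
        Group-Object -Property { $_.CrawlTime.ToString('yyyy-MM-dd HH', [cultureinfo]::InvariantCulture) } |
        Sort-Object -Property Name |
        Select-Object @{ Name = 'Hour'; Expression = { $_.Name } }, Count
)

# Freshness: an entry is stale when the file changed after its last crawl.
$stale = @($entries | Where-Object { $_.CrawlTime -lt $_.Modified })

$perHour
$stale | Select-Object Url, CrawlTime, Modified
```

In this sample only b.docx is stale, because it was modified after its last crawl time.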

PowerShell Script

Prerequisites

Install the PnP.PowerShell module and authenticate with an app registration or interactive credentials that have access to the target sites.
Provide a CSV with a SiteUrl column containing the sites to scan.
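
For reference, a minimal input file might look like the sketch below. It writes a throwaway CSV to the temp folder (the file name and site URLs are placeholders) and checks that the SiteUrl column is present before any connection is attempted.

```powershell
# Placeholder path and URLs; the scripts themselves expect sites1.csv next to the script.
$csvPath = Join-Path ([System.IO.Path]::GetTempPath()) 'sites-example.csv'

@'
SiteUrl
https://contoso.sharepoint.com/sites/hr
https://contoso.sharepoint.com/sites/finance
'@ | Set-Content -Path $csvPath -Encoding UTF8

$sites = @(Import-Csv -Path $csvPath)

# Fail early if the expected column is missing or the file has no rows.
if ($sites.Count -eq 0 -or -not ($sites[0].PSObject.Properties.Name -contains 'SiteUrl')) {
    throw "CSV at $csvPath must contain a 'SiteUrl' column with at least one row."
}

# Drop blank rows and trim stray whitespace before connecting to anything.
$siteUrls = @(
    $sites |
        ForEach-Object { $_.SiteUrl } |
        Where-Object { -not [string]::IsNullOrWhiteSpace($_) } |
        ForEach-Object { $_.Trim() }
)
$siteUrls
```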

Script: Find files which are not searchable

The first script enumerates visible document libraries in each site, enumerates files, runs a KQL Path: query scoped to the library and reports files that do not appear in Search results.

#Requires -Modules PnP.PowerShell
Clear-Host

# ===== Settings =====
$clientId     = "xxxxxxx"
$dateTime     = Get-Date -Format "yyyy-MM-dd-HH-mm-ss"
$tenantUrl    = "https://contoso.sharepoint.com"

# File extensions to exclude (case-insensitive) as they can't be searched using their Path metadata, e.g. Path:FileUrl
$ExcludedExtensions = @('.png', '.jpg', '.jpeg', '.xltx', '.one', '.onetoc2', '.gif','.mp4','.agent')

$invocation     = Get-Variable -Name MyInvocation -ValueOnly
$directoryPath  = Split-Path $invocation.MyCommand.Path
$csvPath        = Join-Path $directoryPath "sites1.csv"    # CSV must have a column 'SiteUrl'

# Ensure output folder exists
$outputFolder = Join-Path $directoryPath "output_files"
if (-not (Test-Path $outputFolder)) { New-Item -ItemType Directory -Path $outputFolder | Out-Null }
$outputCsv    = Join-Path $outputFolder ("SearchLog-Null-" + $dateTime + ".csv")

# Lists/libraries to exclude
$ExcludedLists = @(
    "Access Requests","App Packages","appdata","appfiles","Apps in Testing","Cache Profiles","Composed Looks",
    "Content and Structure Reports","Content type publishing error log","Converted Forms","Device Channels",
    "Form Templates","fpdatasources","Get started with Apps for Office and SharePoint","List Template Gallery",
    "Long Running Operation Status","Maintenance Log Library","Images","site collection images","Master Docs",
    "Master Page Gallery","MicroFeed","NintexFormXml","Quick Deploy Items","Relationships List","Reusable Content",
    "Reporting Metadata","Reporting Templates","Search Config List","Site Assets","Preservation Hold Library",
    "Site Pages","Solution Gallery","Style Library","Suggested Content Browser Locations","Theme Gallery",
    "TaxonomyHiddenList","User Information List","Web Part Gallery","wfpub","wfsvc","Workflow History",
    "Workflow Tasks","Pages"
)

# ===== Safety checks =====
if (-not (Test-Path $csvPath)) {
    Write-Error "CSV not found at $csvPath. Ensure it exists and includes a 'SiteUrl' column."
    exit 1
}

# ===== Helpers =====
function Normalize-Url {
    param([string]$Url)
    if ([string]::IsNullOrWhiteSpace($Url)) { return $null }
    return ($Url.Trim().TrimEnd('/') ).ToLowerInvariant()
}
function Get-UrlVariants {
    param([string]$Url)
    if ([string]::IsNullOrWhiteSpace($Url)) { return @() }
    $u = $Url.Trim()
    $variants = New-Object System.Collections.Generic.List[string]
    $variants.Add((Normalize-Url $u))
    # Add encoded/decode space variants
    $variants.Add((Normalize-Url ($u -replace ' ', '%20')))
    $variants.Add((Normalize-Url ($u -replace '%20', ' ')))
    $variants | Where-Object { $_ } | Select-Object -Unique
}

# ===== Collect results =====
$results = New-Object System.Collections.Generic.List[object]
$sites   = Import-Csv -Path $csvPath   # expects column "SiteUrl"

foreach ($s in $sites) {
    $siteUrl = $s.SiteUrl
    if ([string]::IsNullOrWhiteSpace($siteUrl)) { continue }

    Write-Host "Connecting to site: $siteUrl" -ForegroundColor Cyan

    try {
        # Connect interactively with the client ID
        Connect-PnPOnline -ClientId $clientId -Url $siteUrl -Interactive

        # Get only visible document libraries (exclude hidden/system libraries)
        $libraries = Get-PnPList -Includes BaseType, BaseTemplate, Hidden, Title, ItemCount, RootFolder `
        | Where-Object {
                $_.Hidden -eq $false -and
                $_.BaseType -eq "DocumentLibrary" -and
                $_.Title -notin $ExcludedLists
            }

        foreach ($library in $libraries) {
            $libraryAbsUrl = ($tenantUrl.TrimEnd('/')) + $library.RootFolder.ServerRelativeUrl
            Write-Host "  Library: $($library.Title)" -ForegroundColor Yellow

            # Pull only fields we need and page for large lists
            $listItems = Get-PnPListItem -List $library -PageSize 500 `
                                         -Fields "FileRef","FSObjType"  `
                                         -ErrorAction SilentlyContinue

            # ==== SEARCH RESULTS (library scope) ====
            $kql = "Path:`"$libraryAbsUrl`""
            $searchresults = $null
            try {
                $searchresults = Submit-PnPSearchQuery `
                    -Query $kql `
                    -All `
                    -SelectProperties @("Title","Path","LastModifiedTime") `
                    -SortList @{ "LastModifiedTime" = "Descending" } `
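                    -ErrorAction Stop
            } catch {
                # Surface the failure: with an empty result set, every file in this library is reported as not searchable
                Write-Warning "Search query failed for $libraryAbsUrl : $($_.Exception.Message)"
            }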

            # Build a fast lookup of paths from search results
            $searchPathSet = New-Object 'System.Collections.Generic.HashSet[string]'
            if ($searchresults) {
                $searchRows = @()
                if ($searchresults.ResultRows) { $searchRows = $searchresults.ResultRows }

                foreach ($row in $searchRows) {
                    $p = $null
                    if ($row -is [System.Collections.IDictionary])      { $p = [string]$row["Path"] }
                    elseif ($row.PSObject.Properties.Match("Path"))     { $p = [string]$row.Path }
                    if ($p) {
                        # OPTIONAL: skip excluded extensions to keep the set cleaner
                        $ext = [System.IO.Path]::GetExtension($p)
                        if ($ext -and ($ExcludedExtensions -contains $ext.ToLower())) { continue }
                        $null = $searchPathSet.Add((Normalize-Url $p))
                    }
                }
            }

            # ==== Evaluate each file ====
            foreach ($item in $listItems) {
                # FSObjType: 0=file, 1=folder
                if ($item.FieldValues["FSObjType"] -ne 0) { continue }

                $serverRelative = $item.FieldValues["FileRef"]
                if ([string]::IsNullOrWhiteSpace($serverRelative)) { continue }

                # Skip excluded extensions up front
                $ext = [System.IO.Path]::GetExtension($serverRelative)
                if ($ext -and ($ExcludedExtensions -contains $ext.ToLower())) { continue }

                $fullUrl = ($tenantUrl.TrimEnd('/')) + $serverRelative
                $urlVariants = Get-UrlVariants -Url $fullUrl

                # SEARCHABLE? (if any variant appears in search results)
                $searchable = "No"
                foreach ($v in $urlVariants) {
                    if ($searchPathSet.Contains($v)) { $searchable = "Yes"; break }
                }

                if (($searchable -eq "No")) { 
                    $results.Add([pscustomobject]@{
                        SiteUrl                = $siteUrl
                        LibraryTitle           = $library.Title
                        LibraryUrl             = $libraryAbsUrl
                        FileServerRelativePath = $serverRelative
                        FullUrl                = $fullUrl
                        Searchable             = $searchable
                    })
                }
            }
        }
    }
    catch {
        Write-Warning "Failed on site $siteUrl. Error: $($_.Exception.Message)"
        continue
    }
}

# ===== Export =====
$results | Export-Csv -Path $outputCsv -NoTypeInformation -Encoding UTF8
Write-Host "Export complete: $outputCsv" -ForegroundColor Green
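
The comparison at the heart of the script above can be tried without a tenant. The sketch below reuses the Normalize-Url and Get-UrlVariants logic on invented data: $searchPaths stands in for the Path values returned by Submit-PnPSearchQuery and $libraryFiles for the URLs built from FileRef. Both lists are hypothetical.

```powershell
function Normalize-Url {
    param([string]$Url)
    if ([string]::IsNullOrWhiteSpace($Url)) { return $null }
    return $Url.Trim().TrimEnd('/').ToLowerInvariant()
}

function Get-UrlVariants {
    param([string]$Url)
    if ([string]::IsNullOrWhiteSpace($Url)) { return @() }
    $u = $Url.Trim()
    # Original form plus space-encoded and space-decoded forms, de-duplicated.
    @(
        (Normalize-Url $u),
        (Normalize-Url ($u -replace ' ', '%20')),
        (Normalize-Url ($u -replace '%20', ' '))
    ) | Where-Object { $_ } | Select-Object -Unique
}

# Invented stand-ins for search result paths and library file URLs.
$searchPaths = @(
    'https://contoso.sharepoint.com/sites/demo/Shared Documents/Plan.docx',
    'https://contoso.sharepoint.com/sites/demo/Shared%20Documents/Budget.xlsx'
)
$libraryFiles = @(
    'https://contoso.sharepoint.com/sites/demo/Shared Documents/plan.docx',
    'https://contoso.sharepoint.com/sites/demo/Shared Documents/Budget.xlsx',
    'https://contoso.sharepoint.com/sites/demo/Shared Documents/Draft.docx'
)

# Build the lookup set once, then test each file's variants against it.
$searchPathSet = New-Object 'System.Collections.Generic.HashSet[string]'
foreach ($p in $searchPaths) { $null = $searchPathSet.Add((Normalize-Url $p)) }

$notSearchable = @(
    foreach ($file in $libraryFiles) {
        $found = $false
        foreach ($v in (Get-UrlVariants -Url $file)) {
            if ($searchPathSet.Contains($v)) { $found = $true; break }
        }
        if (-not $found) { $file }
    }
)
$notSearchable
```

Only Draft.docx is reported here: plan.docx matches after lower-casing, and Budget.xlsx matches through its %20 variant. Other encoding differences (anything beyond spaces) are not covered by these variants and would show up as false positives.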

Sample output

Not Searchable

Script: Find files that are not indexed or not searchable (nice to have with crawl log, obsolete)

This second script queries the crawl log for the library and compares crawl timestamps to identify whether files have been indexed and when. It was working last week but is no longer usable now that the underlying API has been deprecated, so do not use it.

#Requires -Modules PnP.PowerShell
Clear-Host

# ===== Settings =====
$clientId     = "xxxxxx"
$dateTime     = Get-Date -Format "yyyy-MM-dd-HH-mm-ss"
$tenantUrl    = "https://contoso.sharepoint.com"

# File extensions to exclude (case-insensitive) as they can't be searched using their Path metadata, e.g. Path:FileUrl
$ExcludedExtensions = @('.png', '.jpg', '.jpeg', '.xltx', '.one', '.onetoc2', '.gif','.mp4','.agent')

$invocation     = Get-Variable -Name MyInvocation -ValueOnly
$directoryPath  = Split-Path $invocation.MyCommand.Path
$csvPath        = Join-Path $directoryPath "sites1.csv"   # CSV must have a column 'SiteUrl' containing a list of site urls

# Ensure output folder exists
$outputFolder = Join-Path $directoryPath "output_files"
if (-not (Test-Path $outputFolder)) { New-Item -ItemType Directory -Path $outputFolder | Out-Null }
$outputCsv    = Join-Path $outputFolder ("NonSearchableIndexable-" + $dateTime + ".csv")

# Lists/libraries to exclude
$ExcludedLists = @(
    "Access Requests","App Packages","appdata","appfiles","Apps in Testing","Cache Profiles","Composed Looks",
    "Content and Structure Reports","Content type publishing error log","Converted Forms","Device Channels",
    "Form Templates","fpdatasources","Get started with Apps for Office and SharePoint","List Template Gallery",
    "Long Running Operation Status","Maintenance Log Library","Images","site collection images","Master Docs",
    "Master Page Gallery","MicroFeed","NintexFormXml","Quick Deploy Items","Relationships List","Reusable Content",
    "Reporting Metadata","Reporting Templates","Search Config List","Site Assets","Preservation Hold Library",
    "Site Pages","Solution Gallery","Style Library","Suggested Content Browser Locations","Theme Gallery",
    "TaxonomyHiddenList","User Information List","Web Part Gallery","wfpub","wfsvc","Workflow History",
    "Workflow Tasks","Pages"
)

# ===== Safety checks =====
if (-not (Test-Path $csvPath)) {
    Write-Error "CSV not found at $csvPath. Ensure it exists and includes a 'SiteUrl' column."
    exit 1
}

# ===== Helpers =====
function Normalize-Url {
    param([string]$Url)
    if ([string]::IsNullOrWhiteSpace($Url)) { return $null }
    return ($Url.Trim().TrimEnd('/') ).ToLowerInvariant()
}
function Get-UrlVariants {
    param([string]$Url)
    if ([string]::IsNullOrWhiteSpace($Url)) { return @() }
    $u = $Url.Trim()
    $variants = New-Object System.Collections.Generic.List[string]
    $variants.Add((Normalize-Url $u))
    # Add encoded/decode space variants
    $variants.Add((Normalize-Url ($u -replace ' ', '%20')))
    $variants.Add((Normalize-Url ($u -replace '%20', ' ')))
    $variants | Where-Object { $_ } | Select-Object -Unique
}

# ===== Collect results =====
$results = New-Object System.Collections.Generic.List[object]
$sites   = Import-Csv -Path $csvPath   # expects column "SiteUrl"

foreach ($s in $sites) {
    $siteUrl = $s.SiteUrl
    if ([string]::IsNullOrWhiteSpace($siteUrl)) { continue }

    Write-Host "Connecting to site: $siteUrl" -ForegroundColor Cyan

    try {
        # Connect interactively with the client ID
        Connect-PnPOnline -ClientId $clientId -Url $siteUrl -Interactive

        # Get only visible document libraries (exclude hidden/system libraries)
        $libraries = Get-PnPList -Includes BaseType, BaseTemplate, Hidden, Title, ItemCount, RootFolder `
        | Where-Object {
                $_.Hidden -eq $false -and
                $_.BaseType -eq "DocumentLibrary" -and
                $_.Title -notin $ExcludedLists
            }

        foreach ($library in $libraries) {
            $libraryAbsUrl = ($tenantUrl.TrimEnd('/')) + $library.RootFolder.ServerRelativeUrl
            Write-Host "  Library: $($library.Title)" -ForegroundColor Yellow

            # Pull only fields we need and page for large lists
            $listItems = Get-PnPListItem -List $library -PageSize 500 `
                                         -Fields "FileRef","FSObjType"  `
                                         -ErrorAction SilentlyContinue

            # ==== SEARCH RESULTS (library scope) ====
            $kql = "Path:`"$libraryAbsUrl`""
            $searchresults = $null
            try {
                $searchresults = Submit-PnPSearchQuery `
                    -Query $kql `
                    -All `
                    -SelectProperties @("Title","Path","LastModifiedTime") `
                    -SortList @{ "LastModifiedTime" = "Descending" } `
                    -ErrorAction SilentlyContinue
            } catch {}

            # Build a fast lookup of paths from search results
            $searchPathSet = New-Object 'System.Collections.Generic.HashSet[string]'
            if ($searchresults) {
                $searchRows = @()
                if ($searchresults.ResultRows) { $searchRows = $searchresults.ResultRows }

                foreach ($row in $searchRows) {
                    $p = $null
                    if ($row -is [System.Collections.IDictionary])      { $p = [string]$row["Path"] }
                    elseif ($row.PSObject.Properties.Match("Path"))     { $p = [string]$row.Path }
                    if ($p) {
                        # OPTIONAL: skip excluded extensions to keep the set cleaner
                        $ext = [System.IO.Path]::GetExtension($p)
                        if ($ext -and ($ExcludedExtensions -contains $ext.ToLower())) { continue }
                        $null = $searchPathSet.Add((Normalize-Url $p))
                    }
                }
            }

            # ==== CRAWL LOG (library scope) ====
            $crawlresults = $null
            $crawlMap = @{}   # url (normalized) -> [DateTime] max last indexed time
            try {
                $crawlresults = Get-PnPSearchCrawlLog -Filter $libraryAbsUrl -RowLimit (($library.ItemCount * 2)+10)
                if ($crawlresults) {
                    foreach ($cr in $crawlresults) {
                        $urlVal = $cr.Url
                        if (-not $urlVal) { continue }

                        # OPTIONAL: skip excluded extensions here as well
                        $ext = [System.IO.Path]::GetExtension($urlVal)
                        if ($ext -and ($ExcludedExtensions -contains $ext.ToLower())) { continue }

                        $lastIdx = $null
                        try { $lastIdx = [datetime]$cr.CrawlTime } catch {}

                        $nUrl = Normalize-Url $urlVal
                        if ($nUrl) {
                            if (-not $crawlMap.ContainsKey($nUrl)) {
                                $crawlMap[$nUrl] = $lastIdx
                            } else {
                                if ($lastIdx -and $crawlMap[$nUrl] -and ($lastIdx -gt $crawlMap[$nUrl])) {
                                    $crawlMap[$nUrl] = $lastIdx
                                } elseif ($lastIdx -and -not $crawlMap[$nUrl]) {
                                    $crawlMap[$nUrl] = $lastIdx
                                }
                            }
                        }
                    }
                }
            } catch {
                Write-Verbose "Crawl log query failed for $libraryAbsUrl : $($_.Exception.Message)"
            }

            # ==== Evaluate each file ====
            foreach ($item in $listItems) {
                # FSObjType: 0=file, 1=folder
                if ($item.FieldValues["FSObjType"] -ne 0) { continue }

                $serverRelative = $item.FieldValues["FileRef"]
                if ([string]::IsNullOrWhiteSpace($serverRelative)) { continue }

                # Skip excluded extensions up front
                $ext = [System.IO.Path]::GetExtension($serverRelative)
                if ($ext -and ($ExcludedExtensions -contains $ext.ToLower())) { continue }

                $fullUrl = ($tenantUrl.TrimEnd('/')) + $serverRelative
                $urlVariants = Get-UrlVariants -Url $fullUrl

                # SEARCHABLE? (if any variant appears in search results)
                $searchable = "No"
                foreach ($v in $urlVariants) {
                    if ($searchPathSet.Contains($v)) { $searchable = "Yes"; break }
                }

                # INDEXED? (if any variant appears in crawl log map)
                $indexed = "No"
                $lastIndexedTime = $null
                foreach ($v in $urlVariants) {
                    if ($crawlMap.ContainsKey($v)) {
                        $indexed = "Yes"
                        $lastIndexedTime = $crawlMap[$v]
                        break
                    }
                }

                if (!($indexed -eq "Yes" -and $searchable -eq "Yes")) { 
                    $results.Add([pscustomobject]@{
                        SiteUrl                = $siteUrl
                        LibraryTitle           = $library.Title
                        LibraryUrl             = $libraryAbsUrl
                        FileServerRelativePath = $serverRelative
                        FullUrl                = $fullUrl
                        Indexed                = $indexed
                        LastIndexedTime        = $lastIndexedTime
                        Searchable             = $searchable
                    })
                }
            }
        }
    }
    catch {
        Write-Warning "Failed on site $siteUrl. Error: $($_.Exception.Message)"
        continue
    }
}
# ===== Export =====
$results | Export-Csv -Path $outputCsv -NoTypeInformation -Encoding UTF8
Write-Host "Export complete: $outputCsv" -ForegroundColor Green

Using the output

Sample output

Files searchable and indexable

The scripts export CSV files into an output_files folder next to the script. Use Excel or PowerShell to filter by the Searchable column (or the now obsolete Indexed column) to prioritise investigation.
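
As one way to triage the export in PowerShell, the sketch below counts non-searchable files per library. It builds a small stand-in CSV with invented rows so it can run anywhere; the column names match the first script's output, while the temp file name and the row values are placeholders.

```powershell
# Stand-in for a real export from output_files; rows are invented.
$sampleCsv = Join-Path ([System.IO.Path]::GetTempPath()) 'SearchLog-Null-example.csv'

@(
    [pscustomobject]@{ SiteUrl = 'https://contoso.sharepoint.com/sites/hr'; LibraryTitle = 'Documents'; FullUrl = 'https://contoso.sharepoint.com/sites/hr/Shared Documents/a.docx'; Searchable = 'No' }
    [pscustomobject]@{ SiteUrl = 'https://contoso.sharepoint.com/sites/hr'; LibraryTitle = 'Documents'; FullUrl = 'https://contoso.sharepoint.com/sites/hr/Shared Documents/b.docx'; Searchable = 'No' }
    [pscustomobject]@{ SiteUrl = 'https://contoso.sharepoint.com/sites/finance'; LibraryTitle = 'Reports'; FullUrl = 'https://contoso.sharepoint.com/sites/finance/Reports/q1.xlsx'; Searchable = 'No' }
) | Export-Csv -Path $sampleCsv -NoTypeInformation -Encoding UTF8

$rows = @(Import-Csv -Path $sampleCsv)

# Count non-searchable files per site and library, largest first.
$summary = @(
    $rows |
        Where-Object { $_.Searchable -eq 'No' } |
        Group-Object -Property SiteUrl, LibraryTitle |
        Sort-Object -Property Count -Descending |
        Select-Object Count,
            @{ Name = 'Site';    Expression = { $_.Group[0].SiteUrl } },
            @{ Name = 'Library'; Expression = { $_.Group[0].LibraryTitle } }
)
$summary | Format-Table -AutoSize
```

Libraries at the top of this summary are reasonable places to start, for example by checking whether the library or site is excluded from search results.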

Troubleshooting

If Get-PnPSearchCrawlLog fails or returns no rows, this is because the underlying APIs have been deprecated as mentioned above.

References

You Can’t Fix What You Can’t See: A Step Back for Microsoft 365 Search

Debugging SharePoint Search with PnP PowerShell and Crawl Logs