R.I.P. Get-PnPSearchCrawlLog. Search queries as an alternative, with gotchas. Not a replacement for the crawl log.
TL;DR:
The scripts in this post use PnP.PowerShell to enumerate document libraries and compare file URLs against Search results to determine whether files are searchable. The plan was also to show whether those files were indexed successfully, however Get-PnPSearchCrawlLog is deprecated (see Deprecate Get-PnPSearchCrawlLog cmdlet for more details) because the underlying CSOM endpoint GetCrawledUrls has been deprecated. Use the first script when you want a quick list of files that are present in a library but do not appear in Search results. The second script was a nice-to-have that read crawl log information to determine indexing recency for further debugging; it was still working last week, but it is no longer usable.
Why this is useful
- Administrators often find files that appear in the search index but are not returned via the Search API, or vice versa. That mismatch can indicate index freshness issues, excluded file types, or problems with the crawl pipeline.
- Microsoft deprecated Get-PnPSearchCrawlLog in some contexts; you can still rely on Submit-PnPSearchQuery to determine whether files appear in Search results, but it is not a replacement for debugging search issues further via the crawl log API. A quick single-file check using this approach follows this list.
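As a quick sanity check before running the full script, here is a minimal sketch (assuming an existing Connect-PnPOnline session; the file URL is a placeholder) that tests a single file against Search:
# Minimal sketch: is one specific file returned by Search? ($fileUrl is a placeholder)
$fileUrl = "https://contoso.sharepoint.com/sites/Finance/Shared Documents/Report.docx"
$kql = "Path:`"$fileUrl`""
$result = Submit-PnPSearchQuery -Query $kql -SelectProperties @("Title","Path","LastModifiedTime")
if (@($result.ResultRows).Count -gt 0) {
Write-Host "File is returned by Search." -ForegroundColor Green
} else {
Write-Host "File is NOT returned by Search." -ForegroundColor Yellow
}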
Limitations / Caveats
- Search-only checks do not prove freshness. A file may be indexed but with stale metadata; the scripts cannot reliably say when the index entry was created unless you query the crawl log, which is no longer possible.
- Some file types are not returned when querying the Search Path property (for example, images and certain Office formats). The scripts include an exclusion list to keep results cleaner.
Why is observability through the crawl log important?
Kasper, another MVP specialising in Search, shared his insights in You Can’t Fix What You Can’t See: A Step Back for Microsoft 365 Search.
My use cases for the CrawlLog API
The crawl log is essential for monitoring how much content is being crawled per hour or per day, especially during migrations. It helps us understand the load on the crawl subsystem and assess whether it may be impacting other features that depend on timely indexing, such as intranet news publishing. We’ve had incidents in the past where published news wasn’t visible to users, even when it was business‑critical, because the crawl was saturated. As a workaround, we rely on the CrawlLog API to “pulse” the crawl and confirm it isn’t overwhelmed by migrated content, ensuring that business‑as‑usual functionality like the intranet remains unaffected.
It also allows us to verify the freshness of indexed data, for example by checking that the crawled date is later than the modified date. This makes it much easier to diagnose broader issues, such as files not appearing in search even after a site reindex. In fact, whenever we’ve raised Microsoft support tickets related to search, engineers have consistently advised us to use the CrawlLog API as the first diagnostic step.
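For reference, this is roughly the freshness check we relied on. It depends on Get-PnPSearchCrawlLog, so it no longer works, and the file URL below is only a placeholder:
# Sketch of the freshness check used while Get-PnPSearchCrawlLog still worked (now broken).
$fileUrl = "https://contoso.sharepoint.com/sites/Finance/Shared Documents/Report.docx"
$modified = (Get-PnPFile -Url "/sites/Finance/Shared Documents/Report.docx" -AsListItem).FieldValues["Modified"]
$lastCrawl = Get-PnPSearchCrawlLog -Filter $fileUrl -RowLimit 10 |
Sort-Object CrawlTime -Descending | Select-Object -First 1
if ($lastCrawl -and $lastCrawl.CrawlTime -gt $modified) {
Write-Host "Index entry is fresher than the last modification." -ForegroundColor Green
} else {
Write-Host "Modified after the last crawl (or never crawled) - stale or missing in the index." -ForegroundColor Yellow
}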
Additionally, I’ve noticed that certain file types, such as JPEG, PNG, and MP3, are not returned from the Search API, even though I could see they were indexed and fully searchable through the SharePoint UI. This gap makes programmatic diagnostics even more dependent on the CrawlLog API, as it provides the only reliable way to validate whether these items were actually crawled and when.
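As an illustration, here is a hedged sketch of checking this from the Search API side (the site URL is a placeholder, and the FileExtension managed property is assumed to be queryable). If it returns no rows while the SharePoint UI clearly finds the images, you have reproduced the gap described above:
# Does the Search API return any JPEGs for this site? (placeholder site URL)
$kql = "FileExtension:jpg Path:`"https://contoso.sharepoint.com/sites/Finance`""
$images = Submit-PnPSearchQuery -Query $kql -All -SelectProperties @("Title","Path")
Write-Host ("JPEG rows returned by the Search API: " + @($images.ResultRows).Count)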
Microsoft should provide an alternative to the crawl log so we can diagnose issues transparently and maintain trust in the platform.
PowerShell Scripts
Prerequisites
- Install the PnP.PowerShell module and authenticate with an app registration or interactive credentials that have access to the target sites.
- Provide a CSV with a SiteUrl column containing the sites to scan; an example follows this list.
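For example, a minimal sites1.csv (the file name and column name are taken from the scripts below; the site URLs are placeholders) can be created like this:
# Create a minimal sites1.csv next to the script; the 'SiteUrl' header is required.
@"
SiteUrl
https://contoso.sharepoint.com/sites/Finance
https://contoso.sharepoint.com/sites/HR
"@ | Set-Content -Path ".\sites1.csv" -Encoding UTF8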
Script: Find files which are not searchable
The first script enumerates visible document libraries in each site, lists their files, runs a KQL Path: query scoped to each library, and reports files that do not appear in Search results.
#Requires -Modules PnP.PowerShell
Clear-Host
# ===== Settings =====
$clientId = "xxxxxxx"
$dateTime = Get-Date -Format "yyyy-MM-dd-HH-mm-ss"
$tenantUrl = "https://contoso.sharepoint.com"
# File extensions to exclude (case-insensitive) as they can't be searched using their Path metadata, e.g. Path:FileUrl
$ExcludedExtensions = @('.png', '.jpg', '.jpeg', '.xltx', '.one', '.onetoc2', '.gif','.mp4','.agent')
$invocation = Get-Variable -Name MyInvocation -ValueOnly
$directoryPath = Split-Path $invocation.MyCommand.Path
$csvPath = Join-Path $directoryPath "sites1.csv" # CSV must have a column 'SiteUrl'
# Ensure output folder exists
$outputFolder = Join-Path $directoryPath "output_files"
if (-not (Test-Path $outputFolder)) { New-Item -ItemType Directory -Path $outputFolder | Out-Null }
$outputCsv = Join-Path $outputFolder ("SearchLog-Null-" + $dateTime + ".csv")
# Lists/libraries to exclude
$ExcludedLists = @(
"Access Requests","App Packages","appdata","appfiles","Apps in Testing","Cache Profiles","Composed Looks",
"Content and Structure Reports","Content type publishing error log","Converted Forms","Device Channels",
"Form Templates","fpdatasources","Get started with Apps for Office and SharePoint","List Template Gallery",
"Long Running Operation Status","Maintenance Log Library","Images","site collection images","Master Docs",
"Master Page Gallery","MicroFeed","NintexFormXml","Quick Deploy Items","Relationships List","Reusable Content",
"Reporting Metadata","Reporting Templates","Search Config List","Site Assets","Preservation Hold Library",
"Site Pages","Solution Gallery","Style Library","Suggested Content Browser Locations","Theme Gallery",
"TaxonomyHiddenList","User Information List","Web Part Gallery","wfpub","wfsvc","Workflow History",
"Workflow Tasks","Pages"
)
# ===== Safety checks =====
if (-not (Test-Path $csvPath)) {
Write-Error "CSV not found at $csvPath. Ensure it exists and includes a 'SiteUrl' column."
exit 1
}
# ===== Helpers =====
function Normalize-Url {
param([string]$Url)
if ([string]::IsNullOrWhiteSpace($Url)) { return $null }
return ($Url.Trim().TrimEnd('/') ).ToLowerInvariant()
}
function Get-UrlVariants {
param([string]$Url)
if ([string]::IsNullOrWhiteSpace($Url)) { return @() }
$u = $Url.Trim()
$variants = New-Object System.Collections.Generic.List[string]
$variants.Add((Normalize-Url $u))
# Add encoded/decode space variants
$variants.Add((Normalize-Url ($u -replace ' ', '%20')))
$variants.Add((Normalize-Url ($u -replace '%20', ' ')))
$variants | Where-Object { $_ } | Select-Object -Unique
}
# ===== Collect results =====
$results = New-Object System.Collections.Generic.List[object]
$sites = Import-Csv -Path $csvPath # expects column "SiteUrl"
foreach ($s in $sites) {
$siteUrl = $s.SiteUrl
if ([string]::IsNullOrWhiteSpace($siteUrl)) { continue }
Write-Host "Connecting to site: $siteUrl" -ForegroundColor Cyan
try {
# Connect interactively with the client ID
Connect-PnPOnline -ClientId $clientId -Url $siteUrl -Interactive
# Get only visible document libraries (exclude hidden/system libraries)
$libraries = Get-PnPList -Includes BaseType, BaseTemplate, Hidden, Title, ItemCount, RootFolder `
| Where-Object {
$_.Hidden -eq $false -and
$_.BaseType -eq "DocumentLibrary" -and
$_.Title -notin $ExcludedLists
}
foreach ($library in $libraries) {
$libraryAbsUrl = ($tenantUrl.TrimEnd('/')) + $library.RootFolder.ServerRelativeUrl
Write-Host " Library: $($library.Title)" -ForegroundColor Yellow
# Pull only fields we need and page for large lists
$listItems = Get-PnPListItem -List $library -PageSize 500 `
-Fields "FileRef","FSObjType" `
-ErrorAction SilentlyContinue
# ==== SEARCH RESULTS (library scope) ====
$kql = "Path:`"$libraryAbsUrl`""
$searchresults = $null
try {
$searchresults = Submit-PnPSearchQuery `
-Query $kql `
-All `
-SelectProperties @("Title","Path","LastModifiedTime") `
-SortList @{ "LastModifiedTime" = "Descending" } `
-ErrorAction SilentlyContinue
} catch {}
# Build a fast lookup of paths from search results
$searchPathSet = New-Object 'System.Collections.Generic.HashSet[string]'
if ($searchresults) {
$searchRows = @()
if ($searchresults.ResultRows) { $searchRows = $searchresults.ResultRows }
foreach ($row in $searchRows) {
$p = $null
if ($row -is [System.Collections.IDictionary]) { $p = [string]$row["Path"] }
elseif ($row.PSObject.Properties.Match("Path")) { $p = [string]$row.Path }
if ($p) {
# OPTIONAL: skip excluded extensions to keep the set cleaner
$ext = [System.IO.Path]::GetExtension($p)
if ($ext -and ($ExcludedExtensions -contains $ext.ToLower())) { continue }
$null = $searchPathSet.Add((Normalize-Url $p))
}
}
}
# ==== Evaluate each file ====
foreach ($item in $listItems) {
# FSObjType: 0=file, 1=folder
if ($item.FieldValues["FSObjType"] -ne 0) { continue }
$serverRelative = $item.FieldValues["FileRef"]
if ([string]::IsNullOrWhiteSpace($serverRelative)) { continue }
# NEW: Skip unwanted extensions up front
$ext = [System.IO.Path]::GetExtension($serverRelative)
if ($ext -and ($ExcludedExtensions -contains $ext.ToLower())) { continue }
$fullUrl = ($tenantUrl.TrimEnd('/')) + $serverRelative
$urlVariants = Get-UrlVariants -Url $fullUrl
# SEARCHABLE? (if any variant appears in search results)
$searchable = "No"
foreach ($v in $urlVariants) {
if ($searchPathSet.Contains($v)) { $searchable = "Yes"; break }
}
if (($searchable -eq "No")) {
$results.Add([pscustomobject]@{
SiteUrl = $siteUrl
LibraryTitle = $library.Title
LibraryUrl = $libraryAbsUrl
FileServerRelativePath = $serverRelative
FullUrl = $fullUrl
Searchable = $searchable
})
}
}
}
}
catch {
Write-Warning "Failed on site $siteUrl. Error: $($_.Exception.Message)"
continue
}
}
# ===== Export =====
$results | Export-Csv -Path $outputCsv -NoTypeInformation -Encoding UTF8
Write-Host "Export complete: $outputCsv" -ForegroundColor Green
Sample output

Script: Find whether files are not indexed or not searchable (nice to have with the crawl log, now obsolete)
This second script queries the crawl log for each library and compares crawl timestamps to identify whether files have been indexed and when. It was working last week but is no longer usable following the deprecation of the underlying API, so do not use it.
#Requires -Modules PnP.PowerShell
Clear-Host
# ===== Settings =====
$clientId = "xxxxxx"
$dateTime = Get-Date -Format "yyyy-MM-dd-HH-mm-ss"
$tenantUrl = "https://contoso.sharepoint.com"
# NEW: File extensions to exclude (case-insensitive) as they can't be searched using their Path metadata, e.g. Path:FileUrl
$ExcludedExtensions = @('.png', '.jpg', '.jpeg', '.xltx', '.one', '.onetoc2', '.gif','.mp4','.agent')
$invocation = Get-Variable -Name MyInvocation -ValueOnly
$directoryPath = Split-Path $invocation.MyCommand.Path
$csvPath = Join-Path $directoryPath "sites1.csv" # CSV must have a column 'SiteUrl' containing a list of site urls
# Ensure output folder exists
$outputFolder = Join-Path $directoryPath "output_files"
if (-not (Test-Path $outputFolder)) { New-Item -ItemType Directory -Path $outputFolder | Out-Null }
$outputCsv = Join-Path $outputFolder ("NonSearchableIndexable-" + $dateTime + ".csv")
# Lists/libraries to exclude
$ExcludedLists = @(
"Access Requests","App Packages","appdata","appfiles","Apps in Testing","Cache Profiles","Composed Looks",
"Content and Structure Reports","Content type publishing error log","Converted Forms","Device Channels",
"Form Templates","fpdatasources","Get started with Apps for Office and SharePoint","List Template Gallery",
"Long Running Operation Status","Maintenance Log Library","Images","site collection images","Master Docs",
"Master Page Gallery","MicroFeed","NintexFormXml","Quick Deploy Items","Relationships List","Reusable Content",
"Reporting Metadata","Reporting Templates","Search Config List","Site Assets","Preservation Hold Library",
"Site Pages","Solution Gallery","Style Library","Suggested Content Browser Locations","Theme Gallery",
"TaxonomyHiddenList","User Information List","Web Part Gallery","wfpub","wfsvc","Workflow History",
"Workflow Tasks","Pages"
)
# ===== Safety checks =====
if (-not (Test-Path $csvPath)) {
Write-Error "CSV not found at $csvPath. Ensure it exists and includes a 'SiteUrl' column."
exit 1
}
# ===== Helpers =====
function Normalize-Url {
param([string]$Url)
if ([string]::IsNullOrWhiteSpace($Url)) { return $null }
return ($Url.Trim().TrimEnd('/') ).ToLowerInvariant()
}
function Get-UrlVariants {
param([string]$Url)
if ([string]::IsNullOrWhiteSpace($Url)) { return @() }
$u = $Url.Trim()
$variants = New-Object System.Collections.Generic.List[string]
$variants.Add((Normalize-Url $u))
# Add encoded/decode space variants
$variants.Add((Normalize-Url ($u -replace ' ', '%20')))
$variants.Add((Normalize-Url ($u -replace '%20', ' ')))
$variants | Where-Object { $_ } | Select-Object -Unique
}
# ===== Collect results =====
$results = New-Object System.Collections.Generic.List[object]
$sites = Import-Csv -Path $csvPath # expects column "SiteUrl"
foreach ($s in $sites) {
$siteUrl = $s.SiteUrl
if ([string]::IsNullOrWhiteSpace($siteUrl)) { continue }
Write-Host "Connecting to site: $siteUrl" -ForegroundColor Cyan
try {
# Connect interactively with the client ID
Connect-PnPOnline -ClientId $clientId -Url $siteUrl -Interactive
# Get only visible document libraries (exclude hidden/system libraries)
$libraries = Get-PnPList -Includes BaseType, BaseTemplate, Hidden, Title, ItemCount, RootFolder `
| Where-Object {
$_.Hidden -eq $false -and
$_.BaseType -eq "DocumentLibrary" -and
$_.Title -notin $ExcludedLists
}
foreach ($library in $libraries) {
$libraryAbsUrl = ($tenantUrl.TrimEnd('/')) + $library.RootFolder.ServerRelativeUrl
Write-Host " Library: $($library.Title)" -ForegroundColor Yellow
# Pull only fields we need and page for large lists
$listItems = Get-PnPListItem -List $library -PageSize 500 `
-Fields "FileRef","FSObjType" `
-ErrorAction SilentlyContinue
# ==== SEARCH RESULTS (library scope) ====
$kql = "Path:`"$libraryAbsUrl`""
$searchresults = $null
try {
$searchresults = Submit-PnPSearchQuery `
-Query $kql `
-All `
-SelectProperties @("Title","Path","LastModifiedTime") `
-SortList @{ "LastModifiedTime" = "Descending" } `
-ErrorAction SilentlyContinue
} catch {}
# Build a fast lookup of paths from search results
$searchPathSet = New-Object 'System.Collections.Generic.HashSet[string]'
if ($searchresults) {
$searchRows = @()
if ($searchresults.ResultRows) { $searchRows = $searchresults.ResultRows }
foreach ($row in $searchRows) {
$p = $null
if ($row -is [System.Collections.IDictionary]) { $p = [string]$row["Path"] }
elseif ($row.PSObject.Properties.Match("Path")) { $p = [string]$row.Path }
if ($p) {
# OPTIONAL: skip excluded extensions to keep the set cleaner
$ext = [System.IO.Path]::GetExtension($p)
if ($ext -and ($ExcludedExtensions -contains $ext.ToLower())) { continue }
$null = $searchPathSet.Add((Normalize-Url $p))
}
}
}
# ==== CRAWL LOG (library scope) ====
$crawlresults = $null
$crawlMap = @{} # url (normalized) -> [DateTime] max last indexed time
try {
$crawlresults = Get-PnPSearchCrawlLog -Filter $libraryAbsUrl -RowLimit (($library.ItemCount * 2)+10)
if ($crawlresults) {
foreach ($cr in $crawlresults) {
$urlVal = $cr.Url
if (-not $urlVal) { continue }
# OPTIONAL: skip excluded extensions here as well
$ext = [System.IO.Path]::GetExtension($urlVal)
if ($ext -and ($ExcludedExtensions -contains $ext.ToLower())) { continue }
$lastIdx = $null
try { $lastIdx = [datetime]$cr.CrawlTime } catch {}
$nUrl = Normalize-Url $urlVal
if ($nUrl) {
if (-not $crawlMap.ContainsKey($nUrl)) {
$crawlMap[$nUrl] = $lastIdx
} else {
if ($lastIdx -and $crawlMap[$nUrl] -and ($lastIdx -gt $crawlMap[$nUrl])) {
$crawlMap[$nUrl] = $lastIdx
} elseif ($lastIdx -and -not $crawlMap[$nUrl]) {
$crawlMap[$nUrl] = $lastIdx
}
}
}
}
}
} catch {
Write-Verbose "Crawl log query failed for $libraryAbsUrl : $($_.Exception.Message)"
}
# ==== Evaluate each file ====
foreach ($item in $listItems) {
# FSObjType: 0=file, 1=folder
if ($item.FieldValues["FSObjType"] -ne 0) { continue }
$serverRelative = $item.FieldValues["FileRef"]
if ([string]::IsNullOrWhiteSpace($serverRelative)) { continue }
# NEW: Skip unwanted extensions up front
$ext = [System.IO.Path]::GetExtension($serverRelative)
if ($ext -and ($ExcludedExtensions -contains $ext.ToLower())) { continue }
$fullUrl = ($tenantUrl.TrimEnd('/')) + $serverRelative
$urlVariants = Get-UrlVariants -Url $fullUrl
# SEARCHABLE? (if any variant appears in search results)
$searchable = "No"
foreach ($v in $urlVariants) {
if ($searchPathSet.Contains($v)) { $searchable = "Yes"; break }
}
# INDEXED? (if any variant appears in crawl log map)
$indexed = "No"
$lastIndexedTime = $null
foreach ($v in $urlVariants) {
if ($crawlMap.ContainsKey($v)) {
$indexed = "Yes"
$lastIndexedTime = $crawlMap[$v]
break
}
}
if (!($indexed -eq "Yes" -and $searchable -eq "Yes")) {
$results.Add([pscustomobject]@{
SiteUrl = $siteUrl
LibraryTitle = $library.Title
LibraryUrl = $libraryAbsUrl
FileServerRelativePath = $serverRelative
FullUrl = $fullUrl
Indexed = $indexed
LastIndexedTime = $lastIndexedTime
Searchable = $searchable
})
}
}
}
}
catch {
Write-Warning "Failed on site $siteUrl. Error: $($_.Exception.Message)"
continue
}
}
# ===== Export =====
$results | Export-Csv -Path $outputCsv -NoTypeInformation -Encoding UTF8
Write-Host "Export complete: $outputCsv" -ForegroundColor Green
Using the output
Sample output

- The scripts export CSV files into an output_files folder next to the script. Use Excel or PowerShell to filter by Searchable (and Indexed, now obsolete) to prioritise investigation; a quick filtering example follows.
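For example, a quick PowerShell filter over the first script's export (replace the file name with the one your run produced):
# Keep only files that Search did not return and show the most useful columns.
$report = Import-Csv ".\output_files\SearchLog-Null-<timestamp>.csv"
$report | Where-Object { $_.Searchable -eq "No" } |
Select-Object SiteUrl, LibraryTitle, FullUrl |
Format-Table -AutoSize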
Troubleshooting
- If Get-PnPSearchCrawlLog fails or returns no rows, it is because the underlying API has been deprecated, as mentioned above.
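If the first script flags files that should be searchable, one thing worth trying (a hedged sketch, not a guaranteed fix; the site URL and library title are placeholders) is to request a reindex of the affected library or web and re-run the check after the next crawl:
# Request a reindex of a library (or the whole web), then re-run the first script later.
Connect-PnPOnline -Url "https://contoso.sharepoint.com/sites/Finance" -ClientId $clientId -Interactive
Request-PnPReIndexList -Identity "Documents"
# Or, to reindex the whole site:
# Request-PnPReIndexWeb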
References
You Can’t Fix What You Can’t See: A Step Back for Microsoft 365 Search
Debugging SharePoint Search with PnP PowerShell and Crawl Logs