@julio
You can control whether Google crawls and indexes your PDF files by using the robots.txt file or the X-Robots-Tag HTTP header.
To stop Googlebot from crawling PDFs, add the following lines to your robots.txt file (note that robots.txt controls crawling, not indexing — a blocked URL can still show up in search results if other pages link to it):

User-agent: Googlebot
Disallow: /*.pdf$
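If you're unsure which URLs that pattern catches: in Googlebot's extended robots.txt syntax, "*" matches any run of characters and a trailing "$" anchors the end of the URL. A quick sketch of that matching logic as a regex (the helper name and sample paths are made up for illustration):

```python
import re

# "/*.pdf$" in robots.txt wildcard syntax is equivalent to this regex:
# any path that ends in ".pdf", with nothing after it.
rule = re.compile(r"^/.*\.pdf$")

def is_blocked(path):
    """Return True if the Disallow rule /*.pdf$ matches this URL path."""
    return rule.match(path) is not None

print(is_blocked("/docs/report.pdf"))       # True: crawl disallowed
print(is_blocked("/docs/report.pdf?dl=1"))  # False: the $ anchor means URLs
                                            # with a query string escape the rule
```

Without the trailing "$", the rule would also match any URL merely containing ".pdf" somewhere in the path.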
Alternatively, you can use the X-Robots-Tag HTTP header to control the crawling and indexing of individual PDF files. To prevent Google from indexing a PDF file, you can add the following header to the HTTP response:
X-Robots-Tag: noindex
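How you attach that header depends on your server. As a minimal sketch, here is a hypothetical WSGI app (names and paths invented for illustration) that adds X-Robots-Tag: noindex to every .pdf response:

```python
def app(environ, start_response):
    # Sketch of a WSGI app: serve everything, but mark PDF responses noindex.
    path = environ.get("PATH_INFO", "/")
    is_pdf = path.endswith(".pdf")
    headers = [("Content-Type", "application/pdf" if is_pdf else "text/html")]
    if is_pdf:
        # Googlebot honors this header the same way it honors a robots meta tag.
        headers.append(("X-Robots-Tag", "noindex"))
    start_response("200 OK", headers)
    return [b"(file contents would go here)"]
```

In Apache, the equivalent is a `Header set X-Robots-Tag "noindex"` directive inside a `<FilesMatch "\.pdf$">` block (requires mod_headers).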
Note the difference between the two methods: robots.txt blocks crawling (but a blocked URL can still appear in search results if it's linked from elsewhere), while X-Robots-Tag: noindex keeps the PDF out of the index but still lets Googlebot crawl it and discover any links inside. For the noindex header to work, the PDF must not be blocked in robots.txt — otherwise Googlebot never fetches the file and never sees the header.
@julio
To keep the HTML page that links to a PDF out of Google's index while still allowing it to be crawled, you can add a "noindex" robots meta tag to that page's head. (A PDF itself has no HTML head, so for the PDF file itself you need the X-Robots-Tag header instead.) The tag looks like this:
<meta name="robots" content="noindex">
This tells search engines like Google not to index the page; they can still crawl it and follow its links unless you also add "nofollow". Note that it may take some time for the change to take effect, since Google has to recrawl the page before it drops it from results it has already indexed.
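For the curious, this is roughly how a crawler recognizes that directive when parsing a fetched page. A minimal sketch using Python's stdlib html.parser (the class name is made up; real crawlers handle many more directives):

```python
from html.parser import HTMLParser

class RobotsMetaParser(HTMLParser):
    """Sketch: flag pages whose robots meta tag contains 'noindex'."""

    def __init__(self):
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "meta" and a.get("name", "").lower() == "robots":
            # The content attribute is a comma-separated list of directives.
            directives = [d.strip() for d in a.get("content", "").lower().split(",")]
            if "noindex" in directives:
                self.noindex = True

parser = RobotsMetaParser()
parser.feed('<html><head><meta name="robots" content="noindex"></head></html>')
print(parser.noindex)  # True
```

The same check is why the tag must sit in the page's head: crawlers look for it in the document markup, which is also why it can't apply to a PDF.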