Allow vs Disallow in Robots.txt Explained

Two lines in a robots.txt file. One says Disallow, the other says Allow, and it's easy to assume they're simple opposites — block this, permit that. Then a folder you thought was fully blocked shows up in Search Console as crawled anyway, or a page you meant to keep open stops getting visited by Googlebot, and it's clear the relationship between the two directives is less obvious than it looks.

The confusion isn't really about syntax. It's about how these two rules interact when they overlap, what they do and don't control, and why a single misplaced slash can quietly change what gets crawled sitewide.

Quick Answer

Disallow tells crawlers not to request a path; Allow tells them they may, even inside a path that Disallow would otherwise block. When both rules could apply to the same URL, the most specific (longest) matching path wins, not whichever line comes first. Neither directive prevents a URL from being indexed — they only control crawling.

What do Allow and Disallow actually mean?

Both directives live inside a User-agent block in robots.txt and describe URL paths, not files or pages by name. Each one tells a specific crawler, or all crawlers, how to treat requests that match that path.

User-agent: *
Disallow: /private/               # blocks everything under /private/
Allow: /private/press-kit.pdf     # except this one file

In that example, the Allow line's path is longer and more specific than the Disallow line's, so /private/press-kit.pdf stays crawlable while the rest of /private/ stays blocked.
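
The precedence rule can be sketched in a few lines of Python. This is a simplified model, not a full parser: it assumes plain prefix matching with no wildcards, ignores User-agent grouping, and assumes that Allow wins when two matching paths are the same length (the tie-break Google documents). The `is_allowed` helper and the `RULES` list are illustrative names, not part of any library.

```python
# Simplified model of robots.txt precedence: plain prefix matching only,
# no wildcard or User-agent handling. All names here are illustrative.

RULES = [
    ("disallow", "/private/"),
    ("allow", "/private/press-kit.pdf"),
]


def is_allowed(path, rules):
    """Return True if `path` may be crawled under the given rules."""
    matches = [(kind, rule) for kind, rule in rules if rule and path.startswith(rule)]
    if not matches:
        return True  # nothing matched, so crawling is allowed by default
    # Longest matching path wins; on equal length, prefer Allow (assumed tie-break).
    kind, _ = max(matches, key=lambda m: (len(m[1]), m[0] == "allow"))
    return kind == "allow"


print(is_allowed("/private/press-kit.pdf", RULES))  # True
print(is_allowed("/private/notes.html", RULES))     # False
print(is_allowed("/about/", RULES))                 # True
```

Swapping the order of the two entries in `RULES` gives the same results, which is the point: under this model, specificity decides the outcome, not line order.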

Why getting this right matters

Robots.txt sits at the very top of the crawling pipeline. A rule set wrong here doesn't just affect one page — it can silently reshape what a search engine sees across an entire section of a site.

📊 Worth knowing: robots.txt problems often trace back to a Disallow rule that was written broader than intended rather than to malformed syntax, since the format itself is only a few lines of plain text.

Step-by-step: writing Allow and Disallow rules correctly

  1. List every path that genuinely shouldn't be crawled. Think admin panels, internal search results, staging folders, or duplicate parameterized URLs — not pages you simply don't want ranking.
  2. Write the broadest Disallow rule first. Block the parent folder, like Disallow: /account/, rather than listing every single file inside it one by one.
  3. Add Allow rules only for real exceptions. If one file or subfolder inside a blocked path still needs to be crawled, add a more specific Allow line pointing directly at it.
  4. Double-check trailing slashes and wildcards. /blog and /blog/ match different sets of URLs, and a stray * can widen a rule far beyond what was intended.
  5. Place the file at the domain root. robots.txt only takes effect at https://yoursite.com/robots.txt — a copy anywhere else in the folder structure is ignored.
  6. Test specific URLs before publishing. Check the paths you most care about against the rule set to confirm each one resolves the way you expect.
  7. Re-check after every site restructure. New folders, renamed sections, or a CMS migration can leave old Disallow rules blocking paths that no longer exist, or missing new ones that should be blocked.
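
Steps 4 and 6 above can be combined into a small pre-publish check. The sketch below assumes plain prefix matching (no wildcards), and the sample paths and expected outcomes are made-up placeholders; swap in the URLs you actually care about.

```python
# Hypothetical pre-publish check. Rule paths and URLs are placeholders.

def matches(rule_path, url_path):
    """Plain prefix match, which is how a rule without wildcards behaves."""
    return url_path.startswith(rule_path)


SAMPLE_PATHS = ["/blog", "/blog/", "/blog/post-1", "/blog-archive/", "/blogroll.html"]

# Without the trailing slash, the rule matches every path starting with "/blog".
without_slash = [p for p in SAMPLE_PATHS if matches("/blog", p)]
# With the trailing slash, only the folder and its contents match.
with_slash = [p for p in SAMPLE_PATHS if matches("/blog/", p)]

print(without_slash)  # all five sample paths
print(with_slash)     # ['/blog/', '/blog/post-1']


def blocked(url_path, disallow_paths):
    """True if any Disallow path matches (Allow exceptions are not modeled here)."""
    return any(matches(rule, url_path) for rule in disallow_paths)


# Map each URL you care about to whether it should be blocked.
EXPECTED = {"/account/settings": True, "/accounting-guide/": False}

for path, should_block in EXPECTED.items():
    status = "ok" if blocked(path, ["/account/"]) == should_block else "MISMATCH"
    print(path, status)
```

A check like this only models the matching logic; it does not replace verifying the live file once it is published.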

Common mistakes with Allow vs Disallow

1. Assuming Disallow removes a page from search results

Disallow only stops crawling. A blocked URL that's linked from elsewhere can still appear in results. To keep something out of search entirely, use a noindex tag or authentication rather than robots.txt, and leave the URL crawlable: a crawler that is blocked from fetching the page never sees the noindex.
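
A noindex signal can be sent either as a `<meta name="robots" content="noindex">` tag in the page's HTML or as an `X-Robots-Tag: noindex` response header. The sketch below shows the header variant using Python's standard WSGI calling convention; the `app` function, the page body, and the `fake_start_response` helper are placeholders for illustration, not a recommended production setup.

```python
# Minimal WSGI app that serves a page with an X-Robots-Tag: noindex header.
# The app and page content are placeholders for illustration.

def app(environ, start_response):
    body = b"<html><body>Internal page</body></html>"
    headers = [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
        ("X-Robots-Tag", "noindex"),  # asks compliant search engines not to index this URL
    ]
    start_response("200 OK", headers)
    return [body]


# Call the app directly, the way a WSGI server would, to inspect the headers.
captured = {}


def fake_start_response(status, headers):
    captured["status"] = status
    captured["headers"] = dict(headers)


app({}, fake_start_response)
print(captured["status"])                   # 200 OK
print(captured["headers"]["X-Robots-Tag"])  # noindex
```

The same `app` could be served locally with `wsgiref.simple_server.make_server` from the standard library; most sites would set the header in their web server or framework configuration instead.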

2. Blocking a folder without freeing the assets inside it

Disallowing a template or theme folder can accidentally block the CSS and JavaScript a page needs to render, causing Google to see a broken or incomplete layout during rendering.

3. Forgetting that paths are case-sensitive

Disallow: /Private/ does not block /private/. If a site uses inconsistent capitalization in its URLs, each variation needs its own line.
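
One way to spot this problem is to compare a rule against a list of known URLs and flag the ones it misses only because of letter case. The helper below is a hypothetical audit sketch, assuming plain prefix matching; the rule and URL list are made-up examples.

```python
# Hypothetical audit helper: finds URLs that a Disallow path misses only
# because of letter case. The rule and URL list are made-up examples.

def case_only_misses(rule_path, url_paths):
    """URLs that would match if case were ignored, but do not match as written."""
    return [
        p for p in url_paths
        if p.lower().startswith(rule_path.lower()) and not p.startswith(rule_path)
    ]


urls = ["/Private/report.pdf", "/private/report.pdf", "/PRIVATE/old/", "/public/"]
print(case_only_misses("/Private/", urls))
# ['/private/report.pdf', '/PRIVATE/old/']
```

Each path in the result would need its own Disallow line, or the site's URLs would need to be normalized to one capitalization.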

4. Writing an Allow rule that's less specific than the Disallow it's meant to override

Because the longest matching path wins, an Allow rule that's shorter or less precise than the competing Disallow rule simply won't take effect, and the block stays in place.

💡 Pro tip When in doubt about which rule wins for a given URL, check the URL directly with Search Console's URL Inspection tool rather than reasoning through the paths by eye. Google has retired the older standalone robots.txt tester, and path matching gets harder to track once wildcards are involved.
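
Wildcards are where reading rules by eye tends to fail. As a rough model of the two special characters defined in RFC 9309 and supported by Google, `*` (any run of characters) and a trailing `$` (end of the URL), a rule path can be translated into a regular expression. The function names and sample paths below are illustrative assumptions, and real crawlers may differ in edge cases such as percent-encoding.

```python
import re

# Rough model of robots.txt wildcard matching. Function names and sample
# paths are illustrative; percent-encoding edge cases are not handled.


def rule_to_regex(rule_path):
    """Translate a robots.txt path pattern into a compiled regex.

    `*` matches any run of characters; a trailing `$` anchors the end of the URL.
    """
    anchored = rule_path.endswith("$")
    core = rule_path[:-1] if anchored else rule_path
    # Escape the literal pieces and join them with ".*" where a "*" stood.
    pattern = ".*".join(re.escape(part) for part in core.split("*"))
    return re.compile(pattern + ("$" if anchored else ""))


def rule_matches(rule_path, url_path):
    # re.match anchors at the start, so rules still behave as prefixes.
    return rule_to_regex(rule_path).match(url_path) is not None


print(rule_matches("/*.pdf$", "/files/report.pdf"))             # True
print(rule_matches("/*.pdf$", "/files/report.pdf?download=1"))  # False
print(rule_matches("/*.pdf", "/files/report.pdf?download=1"))   # True
print(rule_matches("/shop/*?sort=", "/shop/shoes?sort=price"))  # True
print(rule_matches("/*", "/anything/at/all"))                   # True
```

The last line shows how a stray `*` turns a narrow rule into one that matches every URL on the site.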

Real-world examples

How Allow and Disallow are typically combined for common site structures:

E-commerce site
Blocking cart, allowing product pages
Disallow: /cart/
Allow: /products/
Keeps the cart out of the crawl while product listings stay fully crawlable. The Allow line is optional here, since nothing blocks /products/ in the first place, and checkout or account pages would need their own Disallow lines.

WordPress site
Blocking admin, allowing AJAX
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php
A near-universal WordPress default: blocks crawling of the admin panel but frees the single endpoint many front-end scripts rely on.

Publisher site
Blocking search, allowing archives
Disallow: /?s=
(no Allow needed)
Blocks the effectively unlimited set of internal search-result URLs at the site root (such as /?s=keyword) from being crawled, without touching real article or category pages.

SaaS site
Blocking a staging subfolder
Disallow: /beta/
A single broad Disallow rule with no exceptions, used to keep an entire pre-release section out of the crawl.

Allow vs Disallow compared

A side-by-side look at what each directive actually does, and where the common misconceptions creep in.

| Aspect | Disallow | Allow |
| --- | --- | --- |
| Primary function | Blocks a path from being crawled | Permits a path, overriding a broader Disallow |
| Needed by default? | Only for paths you want blocked | Optional; only for exceptions |
| Controls indexing? | No, controls crawling only | No, controls crawling only |
| Conflict resolution | Loses to a more specific Allow rule | Wins over a less specific Disallow rule |
| Respected by all bots? | Only compliant crawlers | Only compliant crawlers |

Build your robots.txt file right now — free

The Rebrixe Robots.txt Generator handles path ordering and specificity for you — pick the folders to block, add exceptions where needed, and get a correctly structured file with Allow and Disallow rules in the right order.


Frequently asked questions

What is the difference between Allow and Disallow?
Disallow tells a crawler not to request a path, while Allow tells a crawler it may request a path even if a broader Disallow rule would otherwise block it. Allow exists to carve out exceptions inside a blocked folder, not to grant access that was never restricted in the first place.

Do I need an Allow rule for every page I want crawled?
No. By default, anything not matched by a Disallow rule is already crawlable, so an Allow line is only necessary when you need to override a broader Disallow rule for a specific file or subfolder.

Which rule wins when Allow and Disallow conflict?
Search engines that follow the robots.txt standard resolve conflicts by matching the longest, most specific path, not by which rule appears first in the file. A more specific Allow rule beats a shorter, more general Disallow rule for the same path. When the two matching paths are equally specific, Google applies the less restrictive rule, which is Allow.

Does Disallow keep a page out of search results?
Disallow blocks crawling, not indexing. A disallowed URL can still appear in search results, usually without a description, if other pages link to it, because Google can index a URL it has never crawled.

Are Allow and Disallow paths case-sensitive?
Yes. Paths in Allow and Disallow rules are matched exactly as written, so /Folder/ and /folder/ are treated as two different paths and need to be listed separately if both should be affected.

Do all crawlers respect Allow and Disallow?
The core behavior is standardized and followed by major crawlers like Googlebot and Bingbot, but some smaller or less compliant bots ignore robots.txt entirely, so it should never be treated as a security or access-control mechanism.

How can I check whether a specific URL is blocked?
Google Search Console's URL Inspection tool shows whether a specific URL is blocked by robots.txt, and the robots.txt report shows which version of the file Google last fetched and any problems it found while parsing it. Both are more reliable than reading the file and guessing how the rules interact.
