Patrick Nelson

FYI, re my point above: Submitted an issue and open to comment/discussion here:

Show 1 attachment(s)

Requests to /child (if a page at that URL doesn't exist, has never existed and/or has since been archived) should result in a 404 error. For example, right now SilverStripe will automatically redirect /child to a random page in the database nested anywhere within the SiteTree hierarchy simply because that page also happens to have child as URLSegment. For example, if the page /path/to/child coincidentally does happen to exist then the current functionality would result in being redirected to that page.

This is problematic for several reasons, especially in websites with a very large number of pages like mine and especially when there's no logical connection between the root-level URL and the URL nested at some deeper level. For example:

• We could have 2 (or 5, or even 10) totally unrelated contact-us pages and this could redirect to a seemingly random one when in reality it should simply 404. • This is also undesirable because the 301 (permanent) redirect will vary depending on which Page returned coincidentally happens to have a smaller Sort value (given the use of -&gt;First() and the framework's default sorting that occurs on SiteTree). Should a user erroneously reference <|> because it happened work at the time, this would negatively impact user experience if a new higher Sort contact-us page happens to get created later on.

This functionality is a result primarily of how OldPageRedirector::find_old_page() works by not providing a ParentID value and simply allowing any page that happens to be in the result to be used and returning a completely irrelevant page . Also, this functionality is duplicated partially in SiteTree::get_by_link() with different code (and somewhat differing logic). This also makes it difficult to override due to the amount of code in the aforementioned ::find_old_page() method and due to the static references in SiteTree::get_by_link() (bypassing SilverStripe's IoC using Injector).

Proposed solution: At minimum, I believe we should add a configuration variable that would allow ensuring that we could toggle this "legacy" functionality. For least impact and continuing backward compatibility, this legacy functionality could stay enabled by default, but my preference would be to have it be disabled by default because to me it seems to violate the <|principle of least astonishment>.

Technically this would mean having ParentID present and equal to 0 if $parent is null in <|this area>, so for example, if the legacy functionality were disabled by config, the resulting functionality would become:

$parentID = $parent ? $parent-&gt;ID : 0;
$pages = SiteTree::get()-&gt;filter(array(
    'URLSegment' =&gt; $URL,
    'ParentID' =&gt; $parentID,

This would still allow for the intended functionality of facilitating dynamic lookups for changes to page URL's when /level1/level2/level3 becomes /newlevel1/newlevel2/level3 (and so on) but would break lookups attempting to find /level3 (i.e. <|this particular unit test> would no longer pass).

Hide attachment content
Patrick Nelson

Does anyone have a logical explanation for precisely why it was decided that the following URL: would always redirect to: (see OldPageRedirector::find_old_page)

Shouldn't that 404?!

Patrick Nelson

p.s. I've been desperately digging away at this because it doesn't make sense to me. In reality there's no logical connection between the page requested and the page returned.


likely for legacy support. hierarchy in SS2 was optional opt in, but tbh i agree it's not a friendly feature

Patrick Nelson

VERY unfriendly. here's why:

I had a page called /path/to/child for a very long time. However, I also had a page called /child but renamed to /new-child. This method is supposed to redirect to /new-child but because this recursive algo decides to just not define a parent at all, it just finds any page in the CMS that currently has child as it's URL segment and returns that.

Patrick Nelson

and thus instead of finding the old page like it should, it just redirects to the existing /path/to/child which is surprising.

Patrick Nelson

Not to mention that, frustratingly, SiteTree::get_by_link also follows a similarly as legacy/unpredictable behavior. 😭

Patrick Nelson

and, ironically, it performs that ->nested_urls check too.

  1. /// Fall back on a unique URLSegment for b/c.
  2. if( !$sitetree
  3. && self::config()->nested_urls
  4. && $page = DataObject::get_one('SiteTree', array(
  5. '"SiteTree"."URLSegment"' => $URLSegment
  6. ), $cache)
  7. ) {
  8. return $page;
  9. }
Patrick Nelson

and that's a huge pain because unfortunately with it being statically defined and with limitations of how SS implements it's IoC (Injector), you end up with an impossible to override core functionality.

Show less replies
Patrick Nelson

seems like in SS 4.0 $app->addMiddleware(...) would be the "route" to go (no pun intended)

Patrick Nelson

Finally, is there actually a way to override the URL that SilverStripe sees? e.g. I have a multi-domain system that I'm working on and I can use ?url= to fool SS as needed in order to facilitate URL's with a duplicate apparent request URL by just prefixing the subdomain in the URL.

Patrick Nelson

Should that just be:

  1. # Process through SilverStripe if no file with the requested name exists.
  2. RewriteCond %{REQUEST_FILENAME} !-f
  3. RewriteRule .* index.php


Patrick Nelson

reason: Is this legacy? Actually read the comments and then the code.

  1. # Process through SilverStripe if no file with the requested name exists.
  2. # Pass through the original path as a query parameter, and retain the existing parameters.
  3. # Try finding framework in the vendor folder first
  4. RewriteCond %{REQUEST_URI} ^(.*)$
  5. RewriteCond %{REQUEST_FILENAME} !-f
  6. RewriteRule .* index.php
  1. Importantly, it's no longer using the ?url= query string parameter (my original reason for researching this for future proofing some related work around this on my end) and the comment there is not accurate anymore, is it? There's no param and no longer a [QSA].
  2. It's not referencing the framework, so that comment is also pointless
  3. Why waste a line capturing from REQUEST_URI when ... it's not used?