December 3, 2017

Drupal, WordPress, Mediawiki, Mail Archive, Gitlab – all in one web site

by Andrey Filippov and Oleg Dzhimiev

Multiple Subdomains for the Same Web Site

It is common for a company or organization to use multiple content management systems (CMS) and specialized web applications to organize its web presence. We describe here how Elphel handles such a variety of CMS and provide source code that can be customized for other similar sites.

We currently use the following CMS and web applications:

  • Drupal as a general purpose CMS for the main site
  • WordPress for the development blogs
  • Mediawiki for the wiki-based documentation
  • Mailman (self-hosted) and Mail Archive (external site) for the mailing list, which is our main channel for user technical support
  • Gitlab CE for the code and other version-controlled content. We used Github but switched to self-hosted Gitlab CE following FSF recommendations
  • Other customized versions of FLOSS web applications, such as OSTicket for support tickets and FrontAccounting for inventory and production
  • In-house developed free software web applications, such as an x3dom-based 3D scene and map viewer, a 3D mechanical assembly viewer available for the assemblies and mechanical components on the wiki, and a WebGL Panorama Viewer/Editor.

Figure 1. Elphel web site on a desktop screen

When users visit a company web site they expect well-organized and logically consistent navigation that leads them to the information they came for. There are multiple ways to combine several applications – for example, it is possible to add a wiki to WordPress or to use one with Gitlab repositories. Of course it is theoretically possible to develop a web site where all the information available in these applications is properly cross-linked and stays current. It may also be possible to customize one of the CMS to handle all the needs, but that would require significant web development resources. For a small company following that path, the required number of web developers may easily exceed the number of hardware, FPGA and software developers.

On the other hand, small company size should not be an excuse for visitors being unable to find the information they are looking for. For us it was a very clear signal when our long-time users asked if we could email them mechanical drawings of the camera boards and components. That meant poor web site usability: for years we have had this information for some 900+ standard and machined parts available in our wiki (393 series and the previous 353 series), as DXF and PDF for 2D drawings and STEP files for 3D. Camera mechanical assemblies can be navigated with the x3dom-based viewer.

Improving Navigation of the Composite Web Site

As we were not going to give up the simultaneous use of multiple content management systems (we needed the functionality of each of them), we looked for the following improvements in information accessibility:

  • a common search over all subdomains that is always available (magnifying glass icon in the top right corner)
  • since we cannot properly cross-link all the information, at least a clear indication to our visitors that there are multiple subdomains.

Figure 2. Site search for ‘ERS’ (acronym for Electronic Rolling Shutter)

We did try to launch a simultaneous search over the main (Drupal) site, the wiki, the blog (WordPress) and Mail Archive using site-specific requests, but that was available only on the main site, and we could not add it to Mail Archive, as it is hosted on an external site. An obvious solution for keeping site-wide search always accessible, including on external sites, was to use iframe elements. Figure 2 shows the results of a simultaneous search for “ERS”, each linking to the corresponding subdomain tab.
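As a rough illustration of what “site-specific requests” can mean, the sketch below turns one query into one search URL per subdomain, each of which could then be loaded into its own iframe. The URL patterns are the defaults of stock Drupal, WordPress and Mediawiki installations and are only an assumption about this site; the `SEARCH_TARGETS` table and the `buildSearchUrls` name are hypothetical.

```javascript
// Hypothetical table of per-subdomain search endpoints. The path and query
// patterns are the stock defaults of each CMS, assumed here for illustration.
const SEARCH_TARGETS = [
  { tab: 'Main', build: (q) => `https://www3.elphel.com/search/node/${encodeURIComponent(q)}` },
  { tab: 'Blog', build: (q) => `https://blog.elphel.com/?s=${encodeURIComponent(q)}` },
  { tab: 'Wiki', build: (q) => `https://wiki.elphel.com/index.php?search=${encodeURIComponent(q)}` },
];

// Return one { tab, url } pair per subdomain, or an empty list for a blank query.
function buildSearchUrls(query) {
  const q = query.trim();
  if (q === '') return [];
  return SEARCH_TARGETS.map(({ tab, build }) => ({ tab, url: build(q) }));
}
```

Calling `buildSearchUrls('ERS')` yields three entries, one per tab, whose `url` values can be assigned to the `src` of the matching iframe elements.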

Figure 3. Elphel web site on a small screen

Imitating Multiple Application Windows

As we have to use iframe elements for the multi-site search, we can use the framed layout to tell visitors that instead of a single web site with consistent navigation there are in fact multiple subdomains, each with a different structure and navigation. Instead of a single browser window or tab we present the web site as a stack of individual windows on a desktop, as shown in Figure 1. Most web sites do not use the full width of a modern computer desktop and have a useful content width of just 800-1100 pixels – in our case that leaves some room to show parts of the subdomain pages that are out of focus. For narrower screens (or browser windows), such as on mobile devices, the page layout is different (Figure 3) – subdomains that are currently out of focus are shown as colored horizontal bars.

Navigation is rather intuitive: a mouse click on a background tab brings it to focus. Following a link that leads to another subdomain (e.g. from the blog tab to the wiki) brings the corresponding tab to focus and shows the linked page there, avoiding the case when there would be two wiki tabs and no blog one. External links open in new browser tabs, which also helps to keep the multi-site structure intact.

Each subdomain page title (Main, Blog, Wiki/Docs…) is a link to the content shown in the current iframe element. Following that link opens just that subdomain page alone, and “Copy Link Location” from the context menu copies its address for use as a bookmark, in an email or in a social media post.

Redirection and Canonical URLs

The next step after selecting the overall page layout was to decide how to show the framed version to visitors and which URLs to treat as canonical for the different pages. At that point we had multiple subdomains: https://www3.elphel.com for the main site (Drupal), https://wiki.elphel.com for the wiki (Mediawiki), https://blog.elphel.com for the blogs (WordPress). The default https://www.elphel.com domain was reserved for the top page that opens individual subdomains in the iframe elements.

One way to deal with it was to permanently redirect https://www3.elphel.com/* to https://www.elphel.com/www3/*, https://wiki.elphel.com/* to https://www.elphel.com/wiki/*, and so on, and then to rely on https://www.elphel.com/index.php to process the request and open subdomains in the respective iframe elements. That would always load the whole stack of pages and could be both intrusive and annoying – visitors may already be aware of the structure of the web site and want just a specific blog post or wiki page. And as iframes are routinely used for serving ads, their functionality may be disabled in a visitor’s browser.

We use the original URLs as canonical (e.g. https://wiki.elphel.com/*, not https://www.elphel.com/wiki/*), so search engines return them in their results, and then redirect requests (with a temporary 302) depending on the referrer. If a referrer (HTTP_REFERER) is provided in the request and it does not belong to the Elphel domain, the framed version is opened with focus on the tab with the requested URL. Otherwise, with no referrer (address entered directly, received via email, or referrer disabled for privacy reasons) or when the link is opened from another Elphel page (such as the already framed version), the page is opened alone.
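The referrer rule above can be written down as a small decision function. This is only a sketch of the logic in JavaScript, not the server configuration actually used on the site; the names `SITE_DOMAIN`, `isOwnHost` and `chooseView`, and the choice to treat an unparsable referrer like a missing one, are assumptions made for the example.

```javascript
// Registered domain whose pages count as "internal" referrers.
const SITE_DOMAIN = 'elphel.com';

// True for elphel.com itself and any of its subdomains.
function isOwnHost(hostname) {
  return hostname === SITE_DOMAIN || hostname.endsWith('.' + SITE_DOMAIN);
}

// Return 'framed' when the visitor arrives from another site,
// and 'alone' when there is no referrer or it is one of our own pages.
function chooseView(referrer) {
  if (!referrer) return 'alone'; // typed in, opened from email, or suppressed
  let host;
  try {
    host = new URL(referrer).hostname;
  } catch (e) {
    return 'alone'; // assumption: a malformed referrer is treated as absent
  }
  return isOwnHost(host) ? 'alone' : 'framed';
}
```

A request whose referrer is a search engine result page would get `'framed'` (and so the temporary redirect), while a click inside the already framed site would get `'alone'`, which prevents nesting the frame stack inside itself.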

Below is the relevant portion of the Apache 2 .htaccess file for https://blog.elphel.com (there are also WordPress-specific rules and rules enforcing SSL access, i.e. the use of https instead of the insecure http protocol). Lines 5..7 disable redirection for certain groups of addresses. Line 2 discloses the fact that the response depends on the referrer page by adding “Vary: Referer” to the headers.

Such redirection is obviously impossible for external sites such as Mail Archive, and we did not use it for https://git.elphel.com – in both cases visitors who follow those links from search engine results would not expect such pages to be part of a well cross-linked company web site.

Implementation

Source code: iframed

The main frame is a single page with iframe elements. The iframes’ addresses are hard-coded in index.php.
To avoid, where possible, displaying the same content in multiple frames, the window.postMessage() method is used. Link click events in an iframe are blocked and the link data is passed to the main frame via postMessage for analysis. The main frame then focuses on the iframe with the matching domain and refreshes its link, or opens a new tab for external web sites.
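The main-frame side of this exchange might look roughly like the sketch below. It is not the code from the iframed repository: the `FRAMES` table, the element ids, the `{ href }` message shape and the `focused` class name are all hypothetical, chosen only to show how a link can be routed by domain.

```javascript
// Hypothetical mapping of iframe element ids to the subdomains they display.
const FRAMES = [
  { id: 'main', host: 'www3.elphel.com' },
  { id: 'blog', host: 'blog.elphel.com' },
  { id: 'wiki', host: 'wiki.elphel.com' },
];

// Decide what to do with a link reported by one of the iframes.
// Returns null when the href is not a valid absolute URL.
function routeLink(href) {
  let host;
  try {
    host = new URL(href).hostname;
  } catch (e) {
    return null;
  }
  const frame = FRAMES.find((f) => f.host === host);
  return frame
    ? { action: 'focus', frameId: frame.id, url: href }
    : { action: 'newTab', url: href };
}

// Browser-only wiring: listen for messages posted by the framed subdomains.
if (typeof window !== 'undefined') {
  window.addEventListener('message', (event) => {
    // Accept messages only from the subdomains we frame ourselves.
    const known = FRAMES.some((f) => event.origin === 'https://' + f.host);
    if (!known || !event.data || typeof event.data.href !== 'string') return;

    const decision = routeLink(event.data.href);
    if (!decision) return;
    if (decision.action === 'newTab') {
      window.open(decision.url, '_blank', 'noopener');
      return;
    }
    const iframe = document.getElementById(decision.frameId);
    if (!iframe) return;
    if (iframe.src !== decision.url) iframe.src = decision.url; // refresh the link
    iframe.classList.add('focused'); // bringing the tab forward is layout-specific
  });
}
```

Keeping the routing decision in a pure function (`routeLink`) separate from the DOM wiring makes the domain matching easy to check outside a browser.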

The code added to each controlled subdomain is in elphel_messenger.js.
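For orientation only, a subdomain-side script of this kind could be as small as the sketch below. It is a hypothetical reconstruction of the idea described above (block the click, post the link to the parent), not the contents of elphel_messenger.js; `PARENT_ORIGIN`, `buildLinkMessage` and the `{ href }` message shape are assumptions.

```javascript
// Assumed origin of the top-level page that hosts the iframes.
const PARENT_ORIGIN = 'https://www.elphel.com';

// Build the message for a clicked link; relative hrefs are resolved
// against the URL of the page that contains the link.
function buildLinkMessage(href, pageUrl) {
  return { href: new URL(href, pageUrl).href };
}

// Browser-only wiring, active only when the page is shown inside a frame.
if (typeof window !== 'undefined' && window.parent !== window) {
  document.addEventListener('click', (event) => {
    const target = event.target;
    const anchor = target && target.closest ? target.closest('a[href]') : null;
    if (!anchor) return;
    const href = anchor.getAttribute('href');
    if (href.startsWith('#')) return; // leave in-page anchors to the browser
    event.preventDefault(); // block the normal navigation
    window.parent.postMessage(
      buildLinkMessage(href, window.location.href),
      PARENT_ORIGIN
    );
  });
}
```

When the page is opened alone (not framed), `window.parent === window`, so no handler is installed and links behave normally.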

Usability and SEO

We are trying to improve the usability of a company web site that is effectively a collection of subdomains, each running a different CMS, without major investment in web design. At the same time we were trying to avoid SEO mistakes (an area where we do not have much experience) that would make Elphel invisible to search engines. Our SEO concerns had the following causes:

  • the top (almost like a “frameset” in older HTML days) page itself has very little content – most is provided in the included iframe elements
  • the served content depends on the referrer address – that might be considered as “cloaking.”

Knowing nothing about how the algorithms distinguish between cloaking and legitimate redirects (such as those based on the visitor’s geographical location), we assumed that cloaking shows indexing bots some information that is not served to human visitors. Our case is different – visitors get either exactly the same information as the bots, or the same plus additional content. We indicated in the response header that the served content depends on the referrer URL (so a smart bot can try different values for “referer”). Testing the new web site for half a year did not show any negative effect on its visibility in search engines.

But the main question still remains – did we reach our goal to improve web site usability?

  • Were we able to communicate the idea that the site consists of multiple loosely-connected subdomains with different navigation?
  • Is the framed site navigation intuitive or annoying?
  • Does the combined search over multiple subdomains do its job and does it behave as users expect?

Only our visitors can tell us, and user comments are always welcome. While we do not have a one-click distribution of this solution for other similar sites (some customization will be needed), our code is available in the Git repository iframed.

