Saturday, 7 February 2015

Disqus comment migration to new domain or platform

We have recently migrated from Drupal to Blogger. We used and are still using Disqus as the commenting platform for our blog. Because the paths (URLs) for our articles have changed, we had to do a migration for the comments as well.

Disqus tools for migrating threads

Disqus offers several tools to help with comment migration based on domain or path changes:
  • Domain Migration Wizard: use when the only thing that changed is the domain (paths to articles remain the same).
  • Upload a URL map: provide a CSV file which maps old paths to new paths.
  • Redirect Crawler: if you’ve set up 301 redirects for all your paths, this crawler will automatically map old paths to new ones.

How we migrated Disqus threads

Though we had set up 301 redirects for our old paths, not all articles were covered by them, so we couldn’t use the Redirect Crawler.
The redirects that you can set up in Blogger have a limitation: you cannot redirect to a different domain. We have a multilingual blog, and because we now use a sub-domain for the Portuguese blog, we could not properly redirect the old Portuguese paths (www.broculos.net/pt/...) to the new sub-domain (pt.broculos.net/...). That alone ruled out the Redirect Crawler.
To migrate the threads, we built a CSV file that maps each old URL to its new URL, one pair per line, which was pretty straightforward:

http://www.broculos.net/en/article/as400-chapter-1-introduction,http://www.broculos.net/2007/10/as400-chapter-1-introduction.html
http://www.broculos.net/en/article/as400-chapter-2-commands,http://www.broculos.net/2007/10/as400-chapter-2-commands.html
http://www.broculos.net/en/article/as400-chapter-3-libraries,http://www.broculos.net/2007/10/as400-chapter-3-libraries.html
...
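
With many articles, the map file can be generated rather than typed by hand. The following is a minimal sketch, not the script we actually used: the `articles` list, `toCsvLine` and `buildUrlMap` are hypothetical names, and it assumes the new Blogger path is the year, a zero-padded month and the unchanged slug, as in the three lines above. Blogger may shorten long slugs, so check the generated lines against the real URLs.

```javascript
// Hypothetical input: old Drupal slug plus the Blogger publication year/month.
const articles = [
  { slug: "as400-chapter-1-introduction", year: 2007, month: 10 },
  { slug: "as400-chapter-2-commands", year: 2007, month: 10 },
];

const DOMAIN = "http://www.broculos.net";

// Build one "old URL,new URL" line. Assumes the slug is unchanged
// and that neither URL contains a comma.
function toCsvLine({ slug, year, month }) {
  const mm = String(month).padStart(2, "0");
  const oldUrl = `${DOMAIN}/en/article/${slug}`;
  const newUrl = `${DOMAIN}/${year}/${mm}/${slug}.html`;
  return `${oldUrl},${newUrl}`;
}

// Join all lines into the text of the CSV file.
function buildUrlMap(list) {
  return list.map(toCsvLine).join("\n");
}

console.log(buildUrlMap(articles));
```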

Missing threads and how Disqus works

After using the URL map tool, we still had a lot of articles without comments and we couldn’t figure out why. At this point, it’s important to understand how Disqus identifies threads and relates them to pages.
Disqus uses the Disqus identifier, which you can pass to the Disqus script through a JavaScript configuration variable named disqus_identifier. This identifier is what determines the appropriate thread to load.

It’s not mandatory to use the Disqus identifier, but if you define it in a page, Disqus will use that to load the thread. If it isn’t defined, Disqus will use the page’s URL.
The migration is supposed to map the old paths to the new ones, but we found a lot of threads with an empty path. Because the URL map has no old path to match them against, they couldn’t be migrated using the Disqus tools. The only way to identify them was by their Disqus identifier.
How did we find that the threads had an empty path? When we started to investigate why the migration had failed for some articles, we used the Disqus Export tool, which exports all threads and comments in an XML format.

By inspecting the XML file, we found the threads that had missing paths and their respective identifiers. Then, for every page with missing comments, we manually defined the identifier and passed it to the Disqus script.
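
The inspection can also be scripted. This is only a rough sketch under assumptions: it supposes that each thread in the export is a `<thread ...>` element with an `<id>` child holding the identifier and a `<link>` child holding the URL, so check those element names against your own export file first. `childText` and `findThreadsWithoutLink` are hypothetical helpers, and the regular expressions are a shortcut that is good enough for a one-off check, not a real XML parser.

```javascript
// Return the trimmed text of the first <tag>...</tag> inside xml, or "".
// A self-closing <tag /> also yields "".
function childText(xml, tag) {
  const match = xml.match(new RegExp(`<${tag}>([\\s\\S]*?)</${tag}>`));
  return match ? match[1].trim() : "";
}

// Collect the identifiers of threads whose link is empty or missing.
function findThreadsWithoutLink(exportXml) {
  // Matches <thread ...>...</thread> but skips self-closing <thread ... />
  // references, which (assumption) may appear inside comment entries.
  const threadPattern = /<thread\b(?:[^>]*[^\/>])?>([\s\S]*?)<\/thread>/g;
  const missing = [];
  for (const [, body] of exportXml.matchAll(threadPattern)) {
    const link = childText(body, "link");
    const identifier = childText(body, "id");
    if (link === "" && identifier !== "") {
      missing.push(identifier);
    }
  }
  return missing;
}
```

In Node.js, the file contents could be read with `fs.readFileSync(path, "utf8")` and passed to `findThreadsWithoutLink`.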

Somewhere on your page (before the Disqus script is loaded) you can identify the thread like this:

var disqus_identifier = "id";

We did this for all the pages with missing threads and the problem was solved.
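
If many pages are affected and they share one template, the identifier can be looked up by path instead of being hard-coded page by page. This is a sketch of that idea, not what we deployed: `identifiersByPath` and `identifierForPath` are hypothetical names, and the identifier value shown is made up. Use the identifiers found in your own export.

```javascript
// Hypothetical map: new Blogger path -> old Disqus identifier from the export.
const identifiersByPath = {
  "/2007/10/as400-chapter-1-introduction.html": "node/12", // made-up identifier
};

// Return the identifier for a path, or undefined when the page has no
// entry, so that Disqus falls back to the page URL.
function identifierForPath(pathname, map) {
  return Object.prototype.hasOwnProperty.call(map, pathname)
    ? map[pathname]
    : undefined;
}

// In the browser, run this before the Disqus script is loaded.
// In a classic script, a property on window is the same global that
// "var disqus_identifier" declares.
if (typeof window !== "undefined") {
  const identifier = identifierForPath(window.location.pathname, identifiersByPath);
  if (identifier !== undefined) {
    window.disqus_identifier = identifier;
  }
}
```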
