Migrating Drupal 7 Forums into an Adxstudio Portal in Microsoft Dynamics – Part 1

I’ve been meaning to write this post for a while. Making the move to DCRM can be time consuming and much consideration is needed on how to get your data into the system. If you’ve been looking around online, you’ve probably discovered that there are a few ways where large sets of data can be imported. I’ve looked at many resources that talk about how you can get data in, and I have to admit, some are super confusing and most only cover one tiny section. As useful as they were, you never get a complete picture.

In this three part post, I’ll go over how you can get your forum posts from Drupal 7, prepare it and get it imported into DCRM.

Getting the Forum Data from Drupal

Step 1 of course is the actual forum posts. I’m making the assumption that you are using the standard forum part of Drupal. If you’re not, the principal is the same. You can tweak the code to suit your environment. We’re going to start by getting the latest forum threads from database using an SQL query. This will include the user’s information, the forum post title, the content and the date the post was created.

SELECT n.nid AS nid, n.uid AS uid, u.mail AS 'Email_Address', n.title AS `Thread_Name`, b.body_value AS Content, n.created AS `Unix_Post_Date`
FROM node n, field_revision_body b, users u 
WHERE type="forum" AND n.nid = b.entity_id AND n.uid = u.uid;

Now, armed with your forum posts, you’ll want to store this into a separate table. Preferably, you would move this into a separate database so that we don’t interfere with the live site.

Note: I did mention that we’ll be covering the process from end to end. However, I’m assuming that if you’ve been tasked with this job, you at least know how to copy the output you got from the last query into another table. I’ve provided the Table Creation script in the next section to ease the pain. All you have to do is export/import your data.

The next set of data we need is the comments that belong to each of these posts. What is a forum without comments?

SELECT c.cid AS cid, c.nid AS nid, c.uid AS uid, u.mail AS 'Email_Address', c.created AS `Unix_Post_Date`, n.title AS `Thread_Name`, c.subject AS `Re_Thread_Name`, b.comment_body_value AS Content 
FROM `comment` c, `node` n, `field_data_comment_body` b, `users` u 
WHERE c.cid = b.entity_id AND c.nid = n.nid AND c.uid = u.uid;

Now, similar to the first output, let’s do the same with comments and store this in a separate table as well.

Storing the Threads and Posts

This is the SQL script you’ll need to create the two new tables used to store the data you obtained in the last section.

Creating the Threads Table

You’ll notice that we have four extra columns we’re creating called, “Last Post Date, First Post Id, Last Post Id, Post Count”. We’re going to need these in the following section.

CREATE TABLE `forum_threads` (
  `nid` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `uid` int(11) DEFAULT NULL,
  `Email_Address` varchar(255) NOT NULL DEFAULT '',
  `Thread_Name` varchar(255) NOT NULL DEFAULT '',
  `Content` longtext NOT NULL,
  `Unix_Post_Date` int(11) DEFAULT NULL,
  `Last_Post_Date` datetime DEFAULT NULL,
  `First_Post_Id` varchar(255) DEFAULT NULL,
  `Last_Post_Id` varchar(255) DEFAULT NULL,
  `Post_Count` int(11) DEFAULT NULL,
  PRIMARY KEY (`nid`)
) ENGINE=InnoDB AUTO_INCREMENT=1012539 DEFAULT CHARSET=utf8;

Creating the Posts Table

Like the Threads table, we have two more columns we’re creating called, “Last Post Date, Unique String”.

CREATE TABLE `forum_posts` (
  `cid` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `nid` int(10) unsigned NOT NULL,
  `uid` int(11) DEFAULT NULL,
  `Email_Address` varchar(255) NOT NULL,
  `Unix_Post_Date` int(11) NOT NULL,
  `Last_Post_Date` datetime DEFAULT NULL,
  `Thread_Name` varchar(255) NOT NULL DEFAULT '',
  `Re_Thread_Name` varchar(255) NOT NULL,
  `Content` longtext,
  `unique_string` varchar(255) DEFAULT NULL,
  PRIMARY KEY (`cid`)
) ENGINE=InnoDB AUTO_INCREMENT=1003445 DEFAULT CHARSET=utf8;

Preparing the Data

Of the ways that data can be imported into DCRM, we’re going to look at one of the simpler ways which is to use XML formatted as Excel XML documents. This should make the process easier and allows you to use the Import feature in the DCRM site and allow you to map your data to the end object.

Before we can continue ahead, we’ll need to fill in the new (NULL filled) columns we created earlier. We’ll start with the modifying the date of the post so that it is in a format that DCRM understands. You can use any language, but we’ll show you how to modify this using PHP.

Updating the Last Post Date in the Forum Threads table

<?php
$query = 'SELECT nid, Unix_Post_Date FROM forum_threads;';
$result = mysql_query($query);
while ($row = mysql_fetch_object($result)) {
	
	$updated_date = gmdate("Y-m-d\TH:i:s.000", $row->Unix_Post_Date);

	$query2 = 'UPDATE forum_threads SET Last_Post_Date="'.$updated_date.'" WHERE nid="'.$row->nid.'";';
	mysql_query($query2);
}
?>

Updating the Last Post Date in the Forum Posts table

<?php
$query = 'SELECT cid, Unix_Post_Date FROM forum_posts;';
$result = mysql_query($query);
while ($row = mysql_fetch_object($result)) {
	
	$updated_date = gmdate("Y-m-d\TH:i:s.000", $row->Unix_Post_Date);

	$query2 = 'UPDATE forum_posts SET Last_Post_Date="'.$updated_date.'" WHERE cid="'.$row->cid.'";';
	mysql_query($query2);
}
?>

Creating a Unique ID for each Forum Post

With Adxstudio, the forum comments works in an interesting way. This is something you probably wouldn’t have to do with Drupal since it checks what the last comment to a post is based on time. This needs to be specified in Adxstudio. We could probably get away with just using the Drupal “cid” field and check which is the last and assign it to a post. I went ahead and created a brand new unique string to attribute each post to so that there’s no confusions with other content within DCRM that may or may not have this same ID.

This next part can literally be done any way you like. We’re just creating a function (or three) to pass back a unique string.

<?php

function generateRandomString($length = 60) {
	$characters = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';
	$charactersLength = strlen($characters);
	$randomString = '';
	for ($i = 0; $i < $length; $i++) {
		$randomString .= $characters[mt_rand(0, $charactersLength - 1)];
	}
	
	return $randomString;
}

function microtime_int() {
	list($usec, $sec) = explode(" ", microtime());
	return ((int)$usec + (int)$sec);
}

function createRandom() {
	
	$randomString = generateRandomString();
	$currentTime = date('yYzmMdDhis');
	$microTime = microtime_int();
	
	$newRandom = $microTime.$randomString.$currentTime;
	
	$shuffled = str_shuffle($newRandom);
	
	return $shuffled;
}

?>

With that out of the way, we’ll need to now update each forum post with a unique string.

<?php

$query = 'SELECT cid FROM forum_posts;';
$result = mysql_query($query);

while ($row = mysql_fetch_object($result)) {
	$unique_id = createRandom();
	
	$query2 = 'UPDATE forum_posts SET unique_string="'.$unique_id.'" WHERE cid="'.$row->cid.'";';
	mysql_query($query2);
}

?>

Updating the First Post ID, Last Post ID and Post Count for each Forum Thread

This is the last part to preparing our data and wraps up the thread table with the remaining data. We’ll be setting the “First Post ID” as the title of the thread, “Last Post ID” as the unique string of the last comment to the post, and the total comments for the post which includes the initial post. This means that if there are 11 comments, the total post count should be 12 as this includes the original thread’s post.

<?php

$query = 'SELECT nid, Thread_Name FROM forum_threads;';
$result = mysql_query($query);
while ($row = mysql_fetch_object($result)) {
	
	$query2 = 'SELECT unique_string FROM forum_posts WHERE nid="'.$row->nid.'" ORDER BY cid DESC;';
	$result2 = mysql_query($query2);
	$row2 = mysql_fetch_object($result2);
	
	for ($x = 0; $x < 1; $x++) {
		$unique_string = $row2->unique_string;
	}
	
	$total_rows = mysql_num_rows($result2);
	$total_rows++;
	
	$query3 = 'UPDATE forum_threads SET First_Post_Id="'.$row->Thread_Name.'", Last_Post_Id="'.$row2->unique_string.'", Post_Count="'.$total_rows.'" WHERE nid="'.$row->nid.'";';
	mysql_query($query3);
}

?>

Structure of the XML Import Scripts

In this section, we’re going to break up the data in both tables into 4 separate XML scripts.

XML Script 1 – Forum Threads

This is the first part that gets uploaded into DCRM. You’ll notice that no IDs are used. As such, the “Thread Name” needs to be unique. You’ll notice that we’re only listing the thread name and not the actual content yet.

<Table ss:ExpandedColumnCount="4" ss:ExpandedRowCount="1" x:FullColumns="1" x:FullRows="1" ss:DefaultColumnWidth="65" ss:DefaultRowHeight="15">
<Row>
<Cell><Data ss:Type="String">Forum</Data></Cell>
<Cell><Data ss:Type="String">Last Post Date</Data></Cell>
<Cell><Data ss:Type="String">Name</Data></Cell>
<Cell><Data ss:Type="String">Type</Data></Cell>
</Row>
</Table>

XML Script 2 – Forum Posts

The initial content for a thread is a “Post” in DCRM. This is the second thing we import. One thing you’ll need to ensure is that you have all the author’s contact record already in DCRM. For the author field, use their email address as this is unique. For the Forum Thread field, this is referring to the “Name” field in the previous step. The “Name” field can be a unique ID as in the database.

<Table ss:ExpandedColumnCount="5" ss:ExpandedRowCount="1" x:FullColumns="1" x:FullRows="1" ss:DefaultColumnWidth="65" ss:DefaultRowHeight="15">
<Row>
<Cell><Data ss:Type="String">Author</Data></Cell>
<Cell><Data ss:Type="String">Date</Data></Cell>
<Cell><Data ss:Type="String">Forum Thread</Data></Cell>
<Cell><Data ss:Type="String">Name</Data></Cell>
<Cell><Data ss:Type="String">Content</Data></Cell>
</Row>
</Table>

Note: Since this is XML data being imported, you’ll want to use CDATA around the content fields.

<![CDATA[  ]]>

XML Script 3 – Forum Comments

This XML is exactly like the Posts. The “Name” field will also be a unique ID as in the database. You’ll need to reference these in a fourth script where you update the main thread with first and last ID.

<Table ss:ExpandedColumnCount="5" ss:ExpandedRowCount="1" x:FullColumns="1" x:FullRows="1" ss:DefaultColumnWidth="65" ss:DefaultRowHeight="15">
<Row>
<Cell><Data ss:Type="String">Author</Data></Cell>
<Cell><Data ss:Type="String">Date</Data></Cell>
<Cell><Data ss:Type="String">Forum Thread</Data></Cell>
<Cell><Data ss:Type="String">Name</Data></Cell>
<Cell><Data ss:Type="String">Content</Data></Cell>
</Row>
</Table>

XML Script 4 – Updating the Initial Forum Thread

For everything to work in harmony, your threads have to know what the ID of the first post and last post is, as well as the total number of posts. Once you have this information, the XML structure for this is below.

<Table ss:ExpandedColumnCount="5" ss:ExpandedRowCount="1" x:FullColumns="1" x:FullRows="1" ss:DefaultColumnWidth="65" ss:DefaultRowHeight="15">
<Row>
<Cell><Data ss:Type="String">Name</Data></Cell>
<Cell><Data ss:Type="String">Forum Thread</Data></Cell>
<Cell><Data ss:Type="String">First Post</Data></Cell>
<Cell><Data ss:Type="String">Last Post</Data></Cell>
<Cell><Data ss:Type="String">Post Count</Data></Cell>
</Row>
</Table>

In Part 2 of this post, I’ll cover how you can get your forum data into this XML format and get it imported into DCRM.

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>