DevPinoy.org
A Filipino Developers Community
   
[Updated] How To: Merge Multiple Microsoft Word Documents

"Combine. Merge. Join. Append. Concatenate. Microsoft Word Documents."

A few weeks ago I wrote an article about merging Word documents in C# and got a great response to it. One of the readers, Abhi, had an interesting problem. The application he wrote was throwing this error:

Word was unable to read this document. It may be corrupt. Try one or more of the following:
* Open and Repair the file.
* Open the file with the Text Recovery converter.

Upon further inspection, I realized that the issue was being raised by this line:

// Create a new file based on our template
Word._Document wordDocument = wordApplication.Documents.Add(
                                   ref defaultTemplate
                                 , ref missing
                                 , ref missing
                                 , ref missing);

That line tries to locate the default Microsoft Word template (Normal.dot) and use it as the base template for the new Word document being created. I ran Filemon against the application and found that it wasn't able to locate the Normal.dot file when the class is used in a web application. The way around this problem is to specify the path to the Normal.dot file in the application's web.config. With that in mind, I modified my first class and implemented the solution.

In your web.config, please add this key inside your appSettings section:

<appSettings>
  <add key="KeithRull.Utilities.OfficeInterop.DefaultWordTemplate" value="-change this to your template location(e.g c:\normal.dot)-"/>
</appSettings>
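If the key is missing or the path is wrong, the failure only surfaces later as a less obvious error inside Word. A minimal sketch of checking the setting up front is shown here. The TemplateSettings class and its method names are hypothetical helpers, not part of the MsWord class, and the project is assumed to reference System.Configuration.

```csharp
using System;
using System.Configuration;
using System.IO;

// Hypothetical helper (not part of the original MsWord class).
public static class TemplateSettings
{
    // Same key as the appSettings entry above.
    public const string TemplateKey =
        "KeithRull.Utilities.OfficeInterop.DefaultWordTemplate";

    // Checks a configured template path and returns it unchanged if it is usable.
    public static string Validate(string configuredPath)
    {
        if (string.IsNullOrEmpty(configuredPath))
        {
            throw new ConfigurationErrorsException(
                "Missing or empty appSettings key: " + TemplateKey);
        }

        // File.Exists also returns false when the worker process
        // has no permission to read the file.
        if (!File.Exists(configuredPath))
        {
            throw new FileNotFoundException(
                "Word template not found or not readable.", configuredPath);
        }

        return configuredPath;
    }

    // Reads the path from web.config/app.config and validates it.
    public static string GetTemplatePath()
    {
        return Validate(ConfigurationManager.AppSettings[TemplateKey]);
    }
}
```

The result of TemplateSettings.GetTemplatePath() could then be passed as the documentTemplate argument of the four-parameter Merge overload.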

Next, update your MsWord.cs. I've added several bug fixes to it (including one for the nasty page break at the end of each document). Below is the C# version of our class:

using System;
using Word = Microsoft.Office.Interop.Word;
using System.Configuration;

namespace KeithRull.Utilities.OfficeInterop
{
    public class MsWord
    {
        /// <summary>
        /// This is the default Word Document Template file. I suggest that you point this to the location
        /// of your Ms Office Normal.dot file which is usually located in your Ms Office Templates folder.
        /// If it does not exist, what you could do is create an empty word document and save it as Normal.dot.
        /// </summary>
        private static string defaultWordDocumentTemplate = ConfigurationManager.AppSettings["KeithRull.Utilities.OfficeInterop.DefaultWordTemplate"].ToString();

        /// <summary>
        /// A function that merges Microsoft Word Documents that uses the default template
        /// </summary>
        /// <param name="filesToMerge">An array of files that we want to merge</param>
        /// <param name="outputFilename">The filename of the merged document</param>
        /// <param name="insertPageBreaks">Set to true if you want to have page breaks inserted after each document</param>
        public static void Merge(string[] filesToMerge, string outputFilename, bool insertPageBreaks)
        {
            Merge(filesToMerge, outputFilename, insertPageBreaks, defaultWordDocumentTemplate);
        }

        /// <summary>
        /// A function that merges Microsoft Word Documents that uses a template specified by the user
        /// </summary>
        /// <param name="filesToMerge">An array of files that we want to merge</param>
        /// <param name="outputFilename">The filename of the merged document</param>
        /// <param name="insertPageBreaks">Set to true if you want to have page breaks inserted after each document</param>
        /// <param name="documentTemplate">The word document you want to use to serve as the template</param>
        public static void Merge(string[] filesToMerge, string outputFilename, bool insertPageBreaks, string documentTemplate)
        {
            object defaultTemplate = documentTemplate;
            object missing = System.Type.Missing;
            object pageBreak = Word.WdBreakType.wdPageBreak;
            object outputFile = outputFilename;

            // Create a new Word application
            Word._Application wordApplication = new Word.Application();

            try
            {
                // Create a new file based on our template
                Word._Document wordDocument = wordApplication.Documents.Add(
                                              ref defaultTemplate
                                            , ref missing
                                            , ref missing
                                            , ref missing);

                // Make a Word selection object.
                Word.Selection selection = wordApplication.Selection;

                // Count the number of documents to insert.
                int documentCount = filesToMerge.Length;

                // A counter that signals that we shouldn't insert a page break after the last document.
                int breakStop = 0;

                // Loop through each of the Word documents
                foreach (string file in filesToMerge)
                {
                    breakStop++;
                    // Insert the file into our new document
                    selection.InsertFile(
                                              file
                                            , ref missing
                                            , ref missing
                                            , ref missing
                                            , ref missing);

                    // Do we want a page break added after each document?
                    if (insertPageBreaks && breakStop != documentCount)
                    {
                        selection.InsertBreak(ref pageBreak);
                    }
                }

                // Save the document to its output file.
                wordDocument.SaveAs(
                                ref outputFile
                            , ref missing
                            , ref missing
                            , ref missing
                            , ref missing
                            , ref missing
                            , ref missing
                            , ref missing
                            , ref missing
                            , ref missing
                            , ref missing
                            , ref missing
                            , ref missing
                            , ref missing
                            , ref missing
                            , ref missing);

                // Clean up!
                wordDocument = null;
            }
            catch (Exception)
            {
                // No error handling is done here; rethrow with "throw;" so the original stack trace is preserved
                throw;
            }
            finally
            {
                // Finally, Close our Word application
                wordApplication.Quit(ref missing, ref missing, ref missing);
            }
        }
    }
}

A VB.NET version of the class was also requested, so I decided to add it to this article. A notable difference between the two is that the VB.NET version doesn't have ref missing all over the place. This is because VB.NET supports optional parameters and C#, as of this writing, does not. Below is the VB.NET version:

Imports System
Imports Word = Microsoft.Office.Interop.Word
Imports System.Configuration

Namespace KeithRull.Utilities.OfficeInterop
    Public Class MsWord
        ''' <summary>
        ''' This is the default Word Document Template file. I suggest that you point this to the location
        ''' of your Ms Office Normal.dot file which is usually located in your Ms Office Templates folder.
        ''' If it does not exist, what you could do is create an empty word document and save it as Normal.dot.
        ''' </summary>
        Private Shared defaultWordDocumentTemplate As String = ConfigurationManager.AppSettings("KeithRull.Utilities.OfficeInterop.DefaultWordTemplate").ToString()

        ''' <summary>
        ''' A function that merges Microsoft Word Documents that uses the default template
        ''' </summary>
        ''' <param name="filesToMerge">An array of files that we want to merge</param>
        ''' <param name="outputFilename">The filename of the merged document</param>
        ''' <param name="insertPageBreaks">Set to true if you want to have page breaks inserted after each document</param>
        Public Shared Sub Merge(ByVal filesToMerge As String(), ByVal outputFilename As String, ByVal insertPageBreaks As Boolean)
            Merge(filesToMerge, outputFilename, insertPageBreaks, defaultWordDocumentTemplate)
        End Sub

        ''' <summary>
        ''' A function that merges Microsoft Word Documents that uses a template specified by the user
        ''' </summary>
        ''' <param name="filesToMerge">An array of files that we want to merge</param>
        ''' <param name="outputFilename">The filename of the merged document</param>
        ''' <param name="insertPageBreaks">Set to true if you want to have page breaks inserted after each document</param>
        ''' <param name="documentTemplate">The word document you want to use to serve as the template</param>
        Public Shared Sub Merge(ByVal filesToMerge As String(), ByVal outputFilename As String, ByVal insertPageBreaks As Boolean, ByVal documentTemplate As String)
            Dim defaultTemplate As Object = documentTemplate
            Dim pageBreak As Object = Word.WdBreakType.wdPageBreak
            Dim outputFile As Object = outputFilename

            ' Create a new Word application
            Dim wordApplication As Word._Application = New Word.Application()

            Try
                ' Create a new file based on our template
                Dim wordDocument As Word._Document = wordApplication.Documents.Add(defaultTemplate)

                ' Make a Word selection object.
                Dim selection As Word.Selection = wordApplication.Selection

                ' Count the number of documents to insert.
                Dim documentCount As Integer = filesToMerge.Length

                ' A counter that signals that we shouldn't insert a page break after the last document.
                Dim breakStop As Integer = 0

                ' Loop through each of the Word documents
                For Each file As String In filesToMerge
                    breakStop += 1
                    ' Insert the file into our new document
                    selection.InsertFile(file)

                    ' Do we want a page break added after each document?
                    If insertPageBreaks AndAlso breakStop <> documentCount Then
                        selection.InsertBreak(pageBreak)
                    End If
                Next

                ' Save the document to its output file.
                wordDocument.SaveAs(outputFile)

                ' Clean up!
                wordDocument = Nothing
            Catch ex As Exception
                ' No error handling is done here; rethrow with "Throw" so the original stack trace is preserved
                Throw
            Finally
                ' Finally, Close our Word application
                wordApplication.Quit()
            End Try
        End Sub
    End Class
End Namespace
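A side note on the ref missing difference mentioned earlier: C# 4.0 and later added optional and named arguments, and the compiler lets you omit ref on calls to COM interop interfaces. Under those assumptions the C# version can be written almost as compactly as the VB.NET one. The following is only a sketch: MsWordCompact and MergeCompact are hypothetical names, it assumes a C# 4.0+ compiler with the same Word interop reference, and I have only the older-style class above in the sample project.

```csharp
using System;
using Word = Microsoft.Office.Interop.Word;

// Hypothetical compact variant, assuming C# 4.0 or later.
public static class MsWordCompact
{
    public static void MergeCompact(string[] filesToMerge, string outputFilename,
                                    bool insertPageBreaks, string documentTemplate)
    {
        Word._Application wordApplication = new Word.Application();

        try
        {
            // Optional COM parameters are simply left out; no "ref missing" needed.
            Word._Document wordDocument = wordApplication.Documents.Add(documentTemplate);
            Word.Selection selection = wordApplication.Selection;

            for (int i = 0; i < filesToMerge.Length; i++)
            {
                selection.InsertFile(filesToMerge[i]);

                // Skip the page break after the last document.
                if (insertPageBreaks && i < filesToMerge.Length - 1)
                {
                    selection.InsertBreak(Word.WdBreakType.wdPageBreak);
                }
            }

            wordDocument.SaveAs(outputFilename);
        }
        finally
        {
            // Always close Word, even if the merge fails.
            wordApplication.Quit();
        }
    }
}
```

It takes the same arguments as the four-parameter Merge overload above.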

To use this class, all you need to do is add the necessary references (Microsoft.Office.Core and Microsoft.Office.Interop.Word) and include my MsWord class in your project. A sample method call is listed below.

protected void mergeDocumentsButton_Click(object sender, EventArgs e)
{
   try
   {
      string document1 = document1FileUpload.PostedFile.FileName;
      string document2 = document2FileUpload.PostedFile.FileName;

      string[] documentsToMerge = { document1, document2 };

      string outputFileName = String.Format("d:\\{0}.doc", Guid.NewGuid());

      MsWord.Merge(documentsToMerge, outputFileName, true);

      messageLabel.Text = outputFileName;
   }
   catch (Exception ex)
   {
      messageLabel.Text = ex.Message;
   }
}
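One thing to watch in the sample above: PostedFile.FileName is the name of the file on the client, so it only points at a real file when the browser and the server are the same machine (as on a development box). On a real server the uploads would need to be saved first. A minimal sketch, assuming ASP.NET Web Forms; UploadHelper, SaveToServer and the working folder are hypothetical:

```csharp
using System;
using System.IO;
using System.Web.UI.WebControls;

// Hypothetical helper: stores an upload on the server and returns its path.
public static class UploadHelper
{
    public static string SaveToServer(FileUpload upload, string workFolder)
    {
        if (!upload.HasFile)
        {
            throw new InvalidOperationException("No file was uploaded.");
        }

        // Use a generated name so uploads with the same file name don't collide.
        string serverPath = Path.Combine(
            workFolder,
            Guid.NewGuid().ToString("N") + Path.GetExtension(upload.FileName));

        upload.SaveAs(serverPath);
        return serverPath;
    }
}
```

In the click handler, document1 would then be UploadHelper.SaveToServer(document1FileUpload, Server.MapPath("~/App_Data")), and likewise for document2. The folder is an assumption; any folder the ASP.NET account can write to would do.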

You can download the sample project here, including both the VB.NET and C# classes:

KeithRull.MergeWordDocuments.WebApp.zip (14.68 KB)

*Note: you might need to update the project's references, since I only have Office 2007 installed on my machine and the references I used were the Office 2007 Primary Interop Assemblies.


Posted 06-09-2007 10:13 AM by keithrull

Comments

sandy wrote re: [Updated] How To: Merge Multiple Microsoft Word Documents
on 06-11-2007 12:30 PM

Hi... I have a small doubt regarding the merging section. After the Word documents merge, how can we assign the name we want to the newly created Word document?

Thank you so much...

keithrull wrote re: [Updated] How To: Merge Multiple Microsoft Word Documents
on 06-11-2007 12:52 PM

There is a parameter in the function called "outputFilename", which you should assign together with the path, e.g. c:\documents\generated.doc

Tom wrote re: [Updated] How To: Merge Multiple Microsoft Word Documents
on 06-19-2007 11:20 AM

I have a similar function, but there is a problem with it so I decided to try yours.  Unfortunately, we both have the same problem.  The problem I am seeing is that the footer from the first page is repeated on every page, even though all the documents I inserted have different text in the footer.  Any ideas?

Emkcah wrote re: [Updated] How To: Merge Multiple Microsoft Word Documents
on 10-16-2007 6:46 PM

Have you tried to run this code in Word 2007 + Vista OS? There is a quite different scenario there, in which the Normal.dotm is corrupted.

Mike Maholchic wrote re: [Updated] How To: Merge Multiple Microsoft Word Documents
on 10-22-2007 12:08 PM

I spent three days trying to get this to work with the range object (did not work) and cut and paste with the selection object (inserted extra page breaks in the target merge document) so thanks, this worked for me.

Brano wrote re: [Updated] How To: Merge Multiple Microsoft Word Documents
on 01-17-2008 7:53 AM

Hi, I have two files I am trying to merge. One is landscape and the other is portrait, but when I merge them they are all portrait. Also, there is an image in the first one, and when they are merged the image is gone... Any ideas?

Anirban wrote re: [Updated] How To: Merge Multiple Microsoft Word Documents
on 04-17-2008 9:21 AM

Brilliant work! Very useful!

I am wondering about one thing:

Suppose merging of Word documents is one among the million features that my application has. If I do not want to tie my application down to a dependency on the Word DLLs, what is my way out? I mean, what if my application is running on a machine that doesn't have Office installed, or has a version lower than 2003? It is OK if the application doesn't merge the documents in that case and simply shows the user a message that Word is not available for merging. Is there a way I can check for the presence of these DLLs before I proceed to use them?

Arviy wrote re: [Updated] How To: Merge Multiple Microsoft Word Documents
on 07-28-2008 3:17 AM

Many Thanks!

Since it takes a long time to merge a large number of files, is there a way we can be notified when the process is finished?

Deepak Tyagi wrote re: [Updated] How To: Merge Multiple Microsoft Word Documents
on 09-07-2008 11:47 PM

Merging the files with above method is good but how can we update the page numbers?

Deepak Tyagi wrote re: [Updated] How To: Merge Multiple Microsoft Word Documents
on 09-08-2008 5:45 AM

I have used the above method to merge the files but i have a problem. The page numbers are not added. How can we add/update the page numbers?

Please reply.

Sumit wrote re: [Updated] How To: Merge Multiple Microsoft Word Documents
on 09-15-2008 11:56 PM

Hi keithrull,

It's a great method. Thanks for posting a nice article.

It works for me. Many Many Thanks.

Sumit

zoi wrote re: [Updated] How To: Merge Multiple Microsoft Word Documents
on 10-08-2008 3:31 AM

Thank you so much XD

Shruti Mishra wrote re: [Updated] How To: Merge Multiple Microsoft Word Documents
on 10-10-2008 6:29 AM

This is mind-blowing code, it saved me 2 days of hard work

Thanks a ton

Sylvain wrote re: [Updated] How To: Merge Multiple Microsoft Word Documents
on 10-17-2008 7:45 AM

Hi !

Thanks for this article and this code ! It helps me a lot.

However, I have a question :

How can you merge documents with different page setups? I mean, for example, if you have one document in landscape and another in portrait? Your code doesn't work for me unless I use the Selection.PageSetup object to set up the page before inserting the file. So I must open each file once to record its PageSetup.

But I think it's not a good way to do so...

sonal wrote re: [Updated] How To: Merge Multiple Microsoft Word Documents
on 03-23-2009 9:41 AM

Hello keithrull, it's a really nice article... please help me, I'm getting an error... Error 1 The type or namespace name 'office' does not exist in the namespace 'Microsoft' (are you missing an assembly reference?)

I searched about this on the net but didn't get any solution...

Thanks

yayaxoxo wrote re: [Updated] How To: Merge Multiple Microsoft Word Documents
on 08-13-2009 4:58 AM

Dear Keith, it is a great article on combining documents. Good work.

I would like to ask you about changing the page orientation.

How can I change the orientation of selected pages while adding a file to Word? The 1st Word document's page will be landscape, the 2nd Word document's page will be portrait, the 3rd landscape, the 4th ... etc.

foreach (string file in filesToMerge)
{
    i = i + 1;
    selection.InsertBreak(ref pageBreak);

    if (i % 2 == 0)
        selection.PageSetup.Orientation = Microsoft.Office.Interop.Word.WdOrientation.wdOrientPortrait;
    else
        selection.PageSetup.Orientation = Microsoft.Office.Interop.Word.WdOrientation.wdOrientLandscape;

    breakStop++;

    // Insert the files to our template
    selection.InsertFile(
          file
        , ref missing
        , ref missing
        , ref missing
        , ref missing);

    //Do we want page breaks added after each documents?
    if (insertPageBreaks && breakStop != documentCount)
    {
        selection.InsertBreak(ref pageBreak);
    }
}

Copyright DevPinoy 2005-2008