The Excel File Format
Pages in this article
Editing elements in an OpenXML file using VBA
Wouldn't it be useful to be able to edit the contents of an Office OpenXML file from within VBA? Well, now you can.
Download
I have made the file used in this article available for download:
Doug Glancy, a colleague of mine, has developed a tool to edit the RibbonX code in your file without ever leaving Excel.
Content of the VBA project
The VBA project contains two modules and one class module:
Content of the VBA project as shown in the project explorer in the VBE
modDemo contains the code that demonstrates the use of the class module clsEditOpenXML. modUNC contains code to work with UNC paths.
The project also uses the Microsoft XML library to ease working with the XML, as shown in the following screenshot of the references:
References of the VBAProject
Also, a couple of other libraries are put to use: the File Scripting Object and the Windows Shell application object, both through late binding as shown later on.
Class module to work with OpenXML files
I have derived a class module that is able to perform the following tasks:
- Unzip an .xlsx or .xlsm file
- Extract any XML file from the folderstructure
- Write back any XML file back into the folder structure
- Zip the file back.
Unzip an .xlsx or .xlsm file
In order to be able to work with the files contained within the OpenXML zipped file structure, the first step that is needed is to unzip the content of the file.
For safety reasons a backup copy of the file is made first. Then the file is renamed by appending .zip after the name. Next, the .zip file is unzipped to a folder.
The code below (taken from class module clsEditOpenXML) shows how this is done:
'-------------------------------------------------------------------------
' Procedure : UnzipFile
' Company : JKP Application Development Services (c)
' Author : Jan Karel Pieterse
' Created : 6-5-2009
' Purpose : Unzips all files in a zip file to a designated folder
'-------------------------------------------------------------------------
'Courtesy www.rondebruin.nl
Dim FSO As Object
Dim oShellApp As Object
Set FSO = CreateObject("scripting.filesystemobject")
'Derive the folder to unzip to from the location of the sourcefile
UnzipFolder = FolderName
'A dedicated unzip folder will be created in the same folder as the sourcefile,
'called ..\Unzipped Filename\
If Right(UnzipFolder, 1) <> "\" Then
UnzipFolder = UnzipFolder & "\UnZipped " & FileName & "\"
Else
UnzipFolder = UnzipFolder & "UnZipped " & FileName & "\"
End If
On Error Resume Next
'Remove all previous existing folders
FSO.deletefolder UnzipFolder & "*", True
Kill UnzipFolder & "*.*"
On Error GoTo 0
'Create normal folder
If FolderExists(UnzipFolder) = False Then
MkDir UnzipFolder
End If
Set oShellApp = CreateObject("Shell.Application")
'Copy the files in the newly created folder
oShellApp.Namespace(UnzipFolder).CopyHere oShellApp.Namespace(SourceFile).items
On Error Resume Next
'Clean up temp folder
FSO.deletefolder Environ("Temp") & "\Temporary Directory*", True
'Inside the now unzipped folder structure all relevant files are
'located here:
XLFolder = UnzipFolder & "xl\"
Set oShellApp = Nothing
Set FSO = Nothing
Exit Sub
End Sub
Extract any XML file from the folderstructure
This little routine extracts an xml file from the unzipped folders and returns the XML contained in that file:
'-------------------------------------------------------------------------
' Procedure : GetXMLFromFile
' Company : JKP Application Development Services (c)
' Author : Jan Karel Pieterse
' Created : 6-5-2009
' Purpose : Gets the XML code from the foldername\filename
'-------------------------------------------------------------------------
Dim oXMLDoc As MSXML2.DOMDocument
If Len(XLFolder) = 0 Then
GetXMLFromFile = ""
Else
Set oXMLDoc = New MSXML2.DOMDocument
oXMLDoc.Load XLFolder & sFileName
GetXMLFromFile = oXMLDoc.XML
Set oXMLDoc = Nothing
End If
End Function
Write back any XML file back into the folder structure
The opposite direction is equally straightforward:
'-------------------------------------------------------------------------
' Procedure : WriteXML2File
' Company : JKP Application Development Services (c)
' Author : Jan Karel Pieterse
' Created : 6-5-2009
' Purpose : Writes sXML to sFileName
'-------------------------------------------------------------------------
Dim oXMLDoc As MSXML2.DOMDocument
Set oXMLDoc = New MSXML2.DOMDocument
oXMLDoc.loadXML sXML
oXMLDoc.Save XLFolder & sFileName
End Sub
Zip the file back.
After we're done editing the xml contents of the unzipped OpenXML package we need to rezip the folders again. The code below does exactly that:
'-------------------------------------------------------------------------
' Procedure : ZipAllFilesInFolder
' Company : JKP Application Development Services (c)
' Author : Jan Karel Pieterse
' Created : 6-5-2009
' Purpose : Zips all files in a folder (including subfolders) whilst retaining the folder structure
'-------------------------------------------------------------------------
'Courtesy www.rondebruin.nl
Dim oShellApp As Object
Dim sDate As String
Dim sDefPath As String
Dim vFileNameZip As Variant
Dim FSO As Object
Dim lFileCt As Long
Set FSO = CreateObject("scripting.filesystemobject")
'To ensure a unique filename,
'append date and time to the name of the current file
sDate = Format(Now, " dd-mmm-yy h-mm-ss")
vFileNameZip = SourceFile & sDate & ".zip"
'Create empty Zip File
NewZip vFileNameZip
Set oShellApp = CreateObject("Shell.Application")
'Count how many items are in the "old" folder
lFileCt = oShellApp.Namespace(FolderName & "Unzipped " & FileName & "\").items.Count
'Copy the files to the compressed folder
oShellApp.Namespace(vFileNameZip).CopyHere oShellApp.Namespace(FolderName & "Unzipped " & FileName & "\").items
'Keep script waiting until we have same # of files in the new folder
On Error Resume Next
Do Until oShellApp.Namespace(vFileNameZip).items.Count = lFileCt
Application.Wait (Now + TimeValue("0:00:01"))
Loop
DoEvents
'Remove original file
Kill SourceFile
'Rename new zipped file to same name as original file (with .zip appended)
Name vFileNameZip As SourceFile
On Error Resume Next
'Now remove old folder, just in case something went haywire
FSO.deletefolder FolderName & "Unzipped " & FileName, True
On Error GoTo 0
Set oShellApp = Nothing
End Sub
Getting the xml file belonging to a specific sheet
One of the most basic operations when working with the OpenXML package would be changing the XML of a worksheet in the file. Of course we'd want to extract the proper XML from the package based on the information mere mortals know: the sheet's name. The code shown below extracts the proper rId (relationship identifier) attribute value as it is listed within the file workbook.xml within the sheets node of that file.
'-------------------------------------------------------------------------
' Procedure : GetSheetIdFromSheetName
' Company : JKP Application Development Services (c)
' Author : Jan Karel Pieterse
' Created : 6-5-2009
' Purpose : Finds out what the SheetId of sSheetname is
' by reading Workbook.xml
'-------------------------------------------------------------------------
Dim oXMLDoc As MSXML2.DOMDocument
Dim oXMLNode As MSXML2.IXMLDOMNode
Dim oXMLNodeList As MSXML2.IXMLDOMNodeList
If mvXLFolder <> "" And Sheet2Change <> "" Then
Set oXMLDoc = New MSXML2.DOMDocument
oXMLDoc.Load XLFolder & "workbook.xml"
Set oXMLNodeList = oXMLDoc.SelectNodes("/workbook/sheets/sheet")
For Each oXMLNode In oXMLNodeList
If oXMLNode.Attributes.getNamedItem("name").nodeValue = sSheetName Then
GetSheetIdFromSheetName = oXMLNode.Attributes.getNamedItem("r:id").nodeValue
Exit Function
End If
Next
End If
End Function
The next routine then finds out which xml file belongs to that Id:
'-------------------------------------------------------------------------
' Procedure : GetSheetFileNameFromId
' Company : JKP Application Development Services (c)
' Author : Jan Karel Pieterse
' Created : 6-5-2009
' Purpose : Fetches the name of the xml file belonging to the sheet with id SheetId.
'-------------------------------------------------------------------------
Dim oXMLDoc As MSXML2.DOMDocument
Dim oXMLNode As MSXML2.IXMLDOMNode
Dim oXMLNodeList As MSXML2.IXMLDOMNodeList
If mvXLFolder <> "" And Sheet2Change <> "" Then
Set oXMLDoc = New MSXML2.DOMDocument
oXMLDoc.Load XLFolder & "_rels\workbook.xml.rels"
Set oXMLNodeList = oXMLDoc.SelectNodes("/Relationships/Relationship")
For Each oXMLNode In oXMLNodeList
If oXMLNode.Attributes.getNamedItem("Id").nodeValue = sSheetId Then
GetSheetFileNameFromId = oXMLNode.Attributes.getNamedItem("Target").nodeValue
Exit Function
End If
Next
End If
End Function
Of course it might be useful to be able to go the other way: Get a sheetname belonging to a specific Id:
'-------------------------------------------------------------------------
' Procedure : GetSheetNameFromId
' Company : JKP Application Development Services (c)
' Author : Jan Karel Pieterse
' Created : 6-5-2009
' Purpose : Returns the sheetname belonging to a sheetId
'-------------------------------------------------------------------------
Dim oXMLDoc As MSXML2.DOMDocument
Dim oXMLNode As MSXML2.IXMLDOMNode
Dim oXMLNodeList As MSXML2.IXMLDOMNodeList
If mvXLFolder <> "" Then
Set oXMLDoc = New MSXML2.DOMDocument
oXMLDoc.Load XLFolder & "workbook.xml"
Set oXMLNodeList = oXMLDoc.SelectNodes("/workbook/sheets/sheet")
For Each oXMLNode In oXMLNodeList
If oXMLNode.Attributes.getNamedItem("r:id").nodeValue = "rId" & Val(sId) + 1 Then
GetSheetNameFromId = oXMLNode.Attributes.getNamedItem("name").nodeValue
'Got it, get out
Exit Function
End If
Next
End If
End Function
How to use the class module
In a normal module (called modDemo) I have demonstrated how the class module may be put to use:
'-------------------------------------------------------------------------
' Procedure : Demo
' Company : JKP Application Development Services (c)
' Author : Jan Karel Pieterse (jkp-ads.com)
' Created : 06-05-2009
' Purpose : Demonstrates getting something from an OpemXML file
'-------------------------------------------------------------------------
Dim cEditOpenXML As clsEditOpenXML
Dim sXML As String
Set cEditOpenXML = New clsEditOpenXML
With cEditOpenXML
'Tell it which OpenXML file to process
.SourceFile = ThisWorkbook.Path & "\formcontrols.xlsm"
'Before you can access info in the file, it must be unzipped
.UnzipFile
'Tell it which sheet you want to change
.Sheet2Change = "MySheet"
'Get XML from the sheet's xml file
sXML = .GetXMLFromFile(.SheetFileName)
'Change the xml of the sheet here
'Now write the xml back to the sheet:
'.WriteXML2File sXML, .SheetFileName
'Now rezip the unzipped package
.ZipAllFilesInFolder
End With
'Only when you let the class go out of scope the zip file's .zip extension is removed
'in the terminate event of the class.
'Then the OpenXML file has its original filename back.
Set cEditOpenXML = Nothing
End Sub
Conclusion
The code shown in this article and in the associated download file shows you a way to extract content from the Office 2007 OpenXML file format and even enables you to edit parts of the content.
Of course it is up to you to keep everything nice and tidy and adhere to the rules of the OpenXML file format specification. See this article for more information about the format.
Comments
All comments about this page:
Comment by: Mike Alexander (7-5-2009 09:54:20) deeplink to this comment
Excellent article JK. Very nice.
Comment by: Jan Karel Pieterse (7-5-2009 10:49:06) deeplink to this comment
Thanks Mike!
Comment by: va sw (10-5-2009 08:57:11) deeplink to this comment
Thank you for a great article
Best Wishes
Comment by: JimC (27-2-2011 14:02:56) deeplink to this comment
Thank you for taking the time to publish this excellent article. It's just what I needed to get me started working with XML content and improve my coding techniques. I own you a beer!
Comment by: Pascal (12-6-2011 12:51:39) deeplink to this comment
Is it posible to add (when not exist) a node (button > macro) to the existing ribbon files in:
%userprofile%\Local Settings\Application Data\Microsoft\Office\olkmailitem.qat
Ik would like to add a button to the qat of Outlook.
Comment by: Jan Karel Pieterse (14-6-2011 00:20:46) deeplink to this comment
Hi Pascal,
Of course you can, but I am not sure whether you can do this when Outlook is open. Nor do I have code samples for you here, as this is an Excel-oriented site! Try googling for Outlook QAT and VBA, maybe something turns up?
Comment by: Globe (25-2-2012 08:25:18) deeplink to this comment
Jan, it's a fantastic article! It helped me to solve a huge problem of hidden corrupted styles there in Excel 2007 files (discussed here http://social.msdn.microsoft.com:80/Forums/en-US/exceldev/thread/446f388f-5eae-4799-ac8e-84330a6a5d4d)
Comment by: Dan (29-3-2012 09:04:27) deeplink to this comment
Hi Jan, this is fantastic, thank you. I've been trying to get my head around navigating XML with vba for a few days now. My question - is there any trick to modifying the XML file once the required node is located? I've in fact been attempting to do so with pres.xml in PowerPoint, though the principle is the same - any changes I make don't get retained / written back to the file. Have searched high and low and posted to ozgrid but no luck. A nudge in the right direction would be great.
Thanks for sharing.
Comment by: Jan Karel Pieterse (30-3-2012 01:58:15) deeplink to this comment
Hi Dan,
After changing the node's value and/or other attributes, you can get the modified XML from the DomDOcument object you used to load the XML into in the first place. Changing the props of a found node will automatically change the XML of the DomDocument.
Then look at the routine I placed below the "How to use the class module" heading above. It has some comments on how to write the XML file back into the package.
Comment by: Tomi S. (10-8-2013 19:08:59) deeplink to this comment
Can I use these functions to rebuild a complicated Excel file from scratch? File thousands of formulas, data tables, named ranges etc and it has been versioned for over 5 years in our company. I think the formatting saved with the file is the thing that is occasionally corrupting the file.
I'd need to unzip the calculation workbook, copy the sheets with formulas, datatables and named cells. Then create an empty workbook, unzip the new file, replace the sheet xml from the calculation workbook xml files, then zip the new file back. Would this work?
BR, Tomi
Comment by: Ralf W, GR (19-2-2014 09:51:05) deeplink to this comment
Thank you for this great demo
Best Wishes
Comment by: miguel chavez (17-12-2014 01:25:33) deeplink to this comment
Hi, Jan, Excellent Article
I have A question, Do you have some article that I can to change the connection string for access different database in sql server?
Example: I have database a,b,c. I want to do this, in xml and excel. Thank you
Comment by: Jan Karel Pieterse (17-12-2014 09:08:38) deeplink to this comment
Hi Miguel,
No, I have no sample code for that. But the information you need is in the file xl\connections.xml and the code the article shows already has some examples for opening and updating another xml file in the package, so you should be able to get this going I expect?
Comment by: Doug Glancy (21-9-2017 20:12:05) deeplink to this comment
Nice! I just came across this. I'm working in a pretty locked-down environment, so can't use the CustomUI editor any more. I use Andy Pope's RibbonX Visual Designer, an addin I can install, but this will allow me to modify the XML directly.
I've updated the API calls to 64-bit 2016 and will add a file open dialog. After that I think I'll add the ability to dump the XML to a text file and then pull it back.
Thanks once again for another great tool from jkp-ads!
Comment by: Jan Karel Pieterse (22-9-2017 15:49:44) deeplink to this comment
Hi Doug,
Glad I could be of help!
Have a question, comment or suggestion? Then please use this form.
If your question is not directly related to this web page, but rather a more general "How do I do this" Excel question, then I advise you to ask your question here: www.eileenslounge.com.