Problem
I have code that converts each sheet of a spreadsheet into a .txt file. It works well, however, given the big number of exports (about 90 .txt files), I’d like to seek advice on how to speed it up.
Sub xlsxTotxt()
Dim i As Integer
Dim directory As String
Dim fname As String
Dim xWs As Worksheet
Dim xTextFile As String
Dim rdate As String
directory = ThisWorkbook.Sheets("Macro").Range("D576").Value
rdate = ThisWorkbook.Sheets("Macro").Range("E47").Value
i = 0
Application.ScreenUpdating = False
Application.DisplayAlerts = False
Do While ThisWorkbook.Sheets("Macro").Range("D577").Offset(i).Value <> ""
fname = Sheets("Macro").Range("D577").Offset(i).Value
Workbooks.Open (directory & fname)
For Each xWs In Workbooks(fname).Worksheets
xWs.Copy
xTextFile = directory & rdate & " - " & xWs.name & ".txt"
Application.ActiveWorkbook.SaveAs filename:=xTextFile, FileFormat:=xlText
Application.ActiveWorkbook.Saved = True
Application.ActiveWorkbook.Close
Next
Workbooks(fname).Close
i = i + 1
Loop
Application.ScreenUpdating = True
Application.DisplayAlerts = True
End Sub
Solution
Your code is incredibly fragile
directory = ThisWorkbook.Sheets("Macro").Range("D576").Value
rdate = ThisWorkbook.Sheets("Macro").Range("E47").Value
If a single row/column gets added or deleted or moved, those ranges are going to move and your code is going to fail completely.
If you can, those values should be in a dedicated sheet, not buried below 500 lines of other stuff. If not, name your ranges.
So, let’s say you take cell D576
and name it Directory_Path
or something similar. Now, rather than
directory = ThisWorkbook.Sheets("Macro").Range("D576").Value
which is incredibly fragile, you can use
directory = ThisWorkbook.Range("Directory_Path").Value
and so long as nobody actually deletes said row/column, that will always point to the right place.
Use With
Rather than re-referencing the same object over and over, you can use a With
statement to hold a reference. Like so:
Application.ActiveWorkbook.SaveAs Filename:=xTextFile, FileFormat:=xlText
Application.ActiveWorkbook.Saved = True
Application.ActiveWorkbook.Close
Becomes
With Application.ActiveWorkbook
.SaveAs Filename:=xTextFile, FileFormat:=xlText
.Saved = True
.Close
End With
much clearer and easier to read.
Handle things in the right order
Specifically your Application.[Settings]
. Anything *meta* like that should go right at the start and right at the end of the method(s) it applies to. This allows you to confirm, at a glance, what the internal state of your method is and check that things are set/reset correctly. You should also disable Application.EnableEvents
and Application.Calculation
for significant extra speed. Like so:
Sub xlsxTotxt()
With Application
.ScreenUpdating = False
.DisplayAlerts = False
.EnableEvents = False
.Calculation = xlCalculationManual
End With
...
Code
...
With Application
.ScreenUpdating = True
.DisplayAlerts = True
.EnableEvents = True
.Calculation = xlCalculationAutomatic
End With
End Sub
Use descriptive naming
Code should be written for other people (including future you) to read. Names should be descriptive, unambiguous and concise. In that order. I highly recommend the excellent, classic article on naming by Joel Spolsky. But in short, things should sound like what they are.
xTextFile
. What on earth is that? Looking at it I have no idea. If I had to guess, I’d say it’s some kind of file
object. Oh, it’s the filename
you want to save
your text file under? Why not call it newFilename
? Since it’s a full filename (including directory path). Maybe newFullfilename
or just fullFilename
might be even more descriptive.
xWs
suffers from the same problem. I can guess it’s a worksheet
object but beyond that? no idea. Since you just use it to iterate over the sheets in your workbook, maybe just call it currentSheet
?
fname
. Same thing. filename
. Since it’s the filename for the workbook you’re opening, how about targetWorkbookFilename
? Sure, it’s long, but screen real estate is cheap, cognitive processing is not and that name’s an awful lot easier to understand and work with.
Also, move your declarations near to where they’re used. This helps you *see* the different parts of your method and helps you keep track of where you are, refer back to your declarations and see ways to split things up into logical sub-methods. Personally, I prefer to keep declarations outside of Loop structures, but that’s personal preference.
Objects are your friend
VBA has a huge, extensive, comprehensive object model for every Office Application. Use it.
ThisWorkbook.Sheets("Macro")
You refer to that sheet multiple times. Rather than continually repeating that reference, put it in a worksheet
object and then just refer to the object.
Dim macroSheet As Worksheet
Set macroSheet = ThisWorkbook.Sheets("Macro")
Dim dateString As String
dateString = macroSheet.Range("E47").Value
Same with Workbooks
.
Do While ThisWorkbook.Sheets("Macro").Range("D577").Offset(i).Value <> ""
fname = Sheets("Macro").Range("D577").Offset(i).Value
Workbooks.Open (directory & fname)
For Each xWs In Workbooks(fname).Worksheets
xWs.Copy
xTextFile = directory & rdate & " - " & xWs.name & ".txt"
Application.ActiveWorkbook.SaveAs filename:=xTextFile, FileFormat:=xlText
Application.ActiveWorkbook.Saved = True
Application.ActiveWorkbook.Close
Relying on ActiveWorkbook
being the one you want is very fragile. Just make it a proper object and then your references will always be accurate:
Dim targetWorkbook As Workbook
Set targetWorkbook = Workbooks.Open (directory & fname)
...
For Each currentSheet In targetWorkbook.Sheets()
...
With targetWorkbook
.SaveAs filename:=xTextFile, FileFormat:=xlText
.Saved = True
.Close
End With
Indenting
Indenting is a wonderful way to let you see the structure of your Sub. It’ll let you pick out nested-logic (Loop
s, If
s, For
s, With
s etc.) and very quickly get a rough idea for where the important stuff in your sub is. Compare:
Do While ThisWorkbook.Sheets("Macro").Range("D577").Offset(i).Value <> ""
fname = Sheets("Macro").Range("D577").Offset(i).Value
Workbooks.Open (directory & fname)
For Each xWs In Workbooks(fname).Worksheets
xWs.Copy
xTextFile = directory & rdate & " - " & xWs.name & ".txt"
Application.ActiveWorkbook.SaveAs filename:=xTextFile, FileFormat:=xlText
Application.ActiveWorkbook.Saved = True
Application.ActiveWorkbook.Close
Next
Workbooks(fname).Close
i = i + 1
Loop
With
Do While ThisWorkbook.Sheets("Macro").Range("D577").Offset(i).Value <> ""
fname = Sheets("Macro").Range("D577").Offset(i).Value
Workbooks.Open (directory & fname)
For Each xWs In Workbooks(fname).Worksheets
xWs.Copy
xTextFile = directory & rdate & " - " & xWs.Name & ".txt"
With Application.ActiveWorkbook
.SaveAs Filename:=xTextFile, FileFormat:=xlText
.Saved = True
.Close
End With
Next
Workbooks(fname).Close
i = i + 1
Loop
And suddenly you can see your program flow, and where the real work is happening, and where loops start/end, in just a glance. Adding some whitespace also goes a long way to helping readability.
Putting it all together
Sub xlsxTotxt()
With Application
.ScreenUpdating = False
.DisplayAlerts = False
.EnableEvents = False
.Calculation = xlCalculationManual
End With
Dim parentDirectoryCell As Range
Set parentDirectoryCell = ThisWorkbook.Range("Directory_Path")
Dim parentDirectoryPath As String
parentDirectoryPath = parentDirectoryCell.Value
Dim dateString As String
dateString = ThisWorkbook.Sheets("Macro").Range("E47").Value
Dim fullFilename As String
Dim currentSheet As Worksheet
Dim targetWorkbook As Workbook
Dim targetWorkbookFilename As String
Dim rowOffset As Long
rowOffset = 1
targetWorkbookFilename = parentDirectoryCell.Offset(rowOffset).Value
Do While targetWorkbookFilename <> ""
Set targetWorkbook = Workbooks.Open(parentDirectoryPath & targetWorkbookFilename)
For Each currentSheet In targetWorkbook.Sheets()
currentSheet.Copy
fullFilename = parentDirectoryPath & dateString & " - " & currentSheet.Name & ".txt"
With targetWorkbook
.SaveAs Filename:=fullFilename, FileFormat:=xlText
.Saved = True
.Close
End With
Next currentSheet
targetWorkbook.Close
rowOffset = rowOffset + 1
targetWorkbookFilename = parentDirectoryCell.Offset(rowOffset).Value
Loop
With Application
.ScreenUpdating = True
.DisplayAlerts = True
.EnableEvents = True
.Calculation = xlCalculationAutomatic
End With
End Sub
Nicola,
To answer your original question.. How to make it go faster. Since you say you are having to take this over, and there is a wider program involved, here’s an interesting solution to add to your environment.
Setup
You will need 2 workbooks.
Workbook1 : Manager.xls
Workbook2 : Runner.xls
Manager.xls has the following code:
Option Explicit
Public Sub manage()
Dim item As Variant
For Each item In Range("booksToExtract")
shellout item
Next item
End Sub
Public Sub shellout(ByVal val As String)
ChDir ThisWorkbook.Path
Shell """c:Program Files (x86)Microsoft OfficeOffice12EXCEL.EXE"" /r " & val & " runner.xls /r "
End Sub
Runner.xls will do you work of extracting the code; there is a bit of management code to identify when a book to extract is available.
Option Explicit
Public Function waitForNewWorkBook() As String
While Workbooks.Count = 1
DoEvents
Wend
If Workbooks(1).Name = ThisWorkbook.Name Then
waitForNewWorkBook = Workbooks(2).Name
Exit Function
End If
waitForNewWorkBook = Workbooks(1).Name
End Function
Public Sub auto_open()
Dim workbookName As String
workbookName = waitForNewWorkBook
ExtractCode workbookName
'Cleanup
Workbooks(workbookName).Close
Application.Quit
End Sub
Public Sub ExtractCode(ByVal workbookName As String)
Dim oWorksheet As Worksheet
Dim oBook As Workbook
Set oBook = Workbooks(workbookName)
' get your filename format here.
' loop through your worksheets and extract here.
oBook.SaveAs oBook.Name & ".txt", xlText
Call oBook.Close(False)
End Sub
Assumptions
I’ve assumed all the workbooks are in the same folder, and the extract folder is also in the same folder. You can modify if needed.
How does this work
- Part One
The Manager workbook loops through all your available workbooks, and calls the shellout method.
The shellout method, calls the VBA Shell method, which runs the string provided and opens a new excel session in a new process, and in that excel opens the runner.xls book, and the next book in your list, both as readonly.
The shellout method will not wait for the excel session to die before returning to the manager, so the manager will just “spawn” as many excel sessions as there are books in the list. You can modify this behaviour to check the number of concurrently running excel session in the processors if you want.
In my example I have I created 10 workbooks, and ran them, at this point I should see in the process explorer, 10 excel sessions, each trying to get processor time. Sadly I only have 2 cores, so the manager sheet is taking one process, and I can only run one other process. But if you have 4 or 16 processors, you could extract 3 or 15 excels at a time.
- Part Two
Now that each of the excel extracts ‘runner.xls myExtractBookX.xls’ is opened the runner.xls will have started the auto_open
method.
waitForNewWorkbook does as it says. Wait untill the myExtractBookX.xls is open and returns the name of the file.
then we run ExtractCode
which is the code to extract your workbook. I leave it to you to implement how you want to extract a workbook.
Now you have a multithreaded excel application. but don’t tell anyone that, because excel is actually single threaded, and actually we’ve just spawned multiple excel processes to run on any free cores.
Finale
If you run this from a command line it can really get interesting…
for /f "usebackq" %i in (`dir /s/b pnl*.xls`) do "c:Program Files (x86)Microsoft OfficeOffice12EXCEL.EXE" /r %i /r runner.xls
I found that 2 additional Application properties make a noticeable difference: .Calculation
and .EnableEvents
. Runtime came down from 20s to 6.5s with 2 files, from 30s to 13s with 3 files:
Sub xlsxTotxt()
Dim fileno As Integer
Dim directory As String
Dim fname As String
Dim xWs As Worksheet
Dim xTextFile As String
Dim rdate As String
Dim namelist As Range
Dim runtime As Variant
' runtime = Timer # decomment if runtime is to be measured
ToggleWaitMode ' switch Application properties
rdate = ThisWorkbook.Sheets("Macro").Range("E47").Value
directory = ThisWorkbook.Sheets("Macro").Range("D576").Value
Set namelist = ThisWorkbook.Sheets("Macro").Range("D577")
fileno = 0
Do While namelist.Offset(fileno).Value <> ""
fname = namelist.Offset(fileno).Value
Workbooks.Open (directory & fname)
For Each xWs In Workbooks(fname).Worksheets
xTextFile = directory & rdate & " - " & xWs.name & ".txt"
xWs.Copy ' create a new workbook from this sheet
With ActiveWorkbook
.SaveAs filename:=xTextFile, FileFormat:=xlText
.Saved = True
.Close
End With
Next xWs
Workbooks(fname).Close
fileno = fileno + 1
Loop
ToggleWaitMode False
' runtime = Timer - runtime
' MsgBox runtime
End Sub
Public Sub ToggleWaitMode(Optional ByVal wait As Boolean = True)
With Excel.Application
.Calculation = IIf(wait, xlCalculationManual, xlCalculationAutomatic)
.Cursor = IIf(wait, xlWait, xlDefault)
.StatusBar = IIf(wait, "Please wait...", False)
.DisplayAlerts = Not wait
.ScreenUpdating = Not wait
.EnableEvents = Not wait
End With
End Sub
Switching properties of the Application object is so common in macros that I use a public function of it’s own for this, ToggleWaitMode()
(found here). In addition to the original settings it suppresses auto-macros and switches off the recalculation.
Finally I’ve used some of the good advice about use of objects instead of direct references etc. found in the other answers.
The vast majority of time in your program will be spent opening the workbooks. It’s a terribly slow operation. If you can avoid that, you’ll see massive performance gains.
That’s great, but how do you expect me to export these without opening them?
By using ADODB to connect to the closed file and query the sheets. Then you can write the results of those queries to text file.