I have written this code to get the input string from the spreadsheet and search the string across the text files and state whether or not it is found in the Excel sheet.
Scenario:
Excel (sheet1):
ColumnA AAA BBB
Drive C:
C: should be give in Column D5.
It has subfolders TEMP1, TEMP2, ...
Folder C:\TEMP1\
It has these text files:
- X1.txt (has content AAA)
- X2.txt (doesn't have any search data)
Folder C:\TEMP2\
It has these text files:
- Y1.txt (has content BBB)
- Y2.txt (doesn't have any search data)
Public K As Integer
Dim fs As Object
Dim fso As Object
Public fpth As String
Public str As String
Public Sub SearchInSQSDatabase_Click()
Dim ws1 As Worksheet
Set ws1 = Sheets(1)
Dim ws2 As Worksheet
Set ws2 = Sheets(2)
K = 2
Dim i As Integer
i = 1
ws2.Cells(1, 4).Value = "Search String"
ws2.Cells(1, 5).Value = "Files"
ws2.Cells(1, 6).Value = "Comments"
Do While Cells(i, 1).Value <> ""
'ws2.Cells(i, 1).Value = Cells(i, 1).Value
str = Cells(i, 1).Value
ShowFolderList (ws1.Cells(5, 4).Value)
i = i + 1
Loop
End Sub
''''''''''''''''''''''''''''''''''''''''''''''''''''''
'Get the List of Files and folders
''''''''''''''''''''''''''''''''''''''''''''''''''''''
Sub ShowFolderList(folderspec)
On Error GoTo local_err
Dim f, f1, fc, s, sFldr
Dim ws2 As Worksheet
Set ws2 = Sheets(2)
Set fs = CreateObject("Scripting.FileSystemObject")
Set f = fs.GetFolder(folderspec)
Set fc = f.SubFolders
For Each f1 In fc
' If Right(f1, 1) <> "\" Then ShowFolderList f1 & "\" Else ShowFolderList f1
If Right(f1, 1) <> "\" Then ShowFolderList f1 Else ShowFolderList f1
Next
Set fc = f.Files
For Each f1 In fc
' ws2.Cells(K, 4).Value = folderspec & "\" & f1.Name
fpth = folderspec & "\" & f1.Name
StringExistsInFile (f1.Name)
'K = K + 1
Next
local_exit:
Exit Sub
local_err:
MsgBox Err & " " & Err.Description
Resume local_exit
Resume
End Sub
''''''''''''''''''''''''''''''''''''''''''''''''''''''
'Check for the String
''''''''''''''''''''''''''''''''''''''''''''''''''''''
Sub StringExistsInFile(fname)
Dim theString, callString As String
Dim path As String
Dim StrFile As String
Dim file
Dim line As String
Dim flag As String
Set fso = CreateObject("Scripting.FileSystemObject")
Dim ws2 As Worksheet
Set ws2 = Sheets(2)
theString = str
path = fpth
StrFile = Dir(path)
Do While StrFile <> ""
'Find TheString in the file
'If found, debug.print and exit loop
Set file = fso.OpenTextFile(path)
Do While Not file.AtEndOfLine
line = file.ReadLine
If InStr(1, line, theString, vbTextCompare) > 0 Then
ws2.Cells(K, 4).Value = str
ws2.Cells(K, 5).Value = fname
ws2.Cells(K, 6).Value = "Srting is found"
flag = "Y"
End If
Loop
file.Close
Set file = Nothing
Set fso = Nothing
StrFile = Dir()
Loop
If flag = "Y" Then
K = K + 1
Else
ws2.Cells(K, 4).Value = str
ws2.Cells(K, 5).Value = fname
ws2.Cells(K, 6).Value = "String is not Found"
K = K + 1
End If
End Sub
Output:
Output should be written in sheet2:
ColumnA columnB AAA X1 BBB Y1
I have updated the code to have a hyperlink in the text file name to open the text file for the strings that are found, also need to no the count of column
added the below logic at the end, I feel the below code can be written in better way
Do While ws2.Cells(K, 6).Value <> ""
K = K + 1
Loop
Do While j < K
If ws2.Cells(j, 7).Value <> "" Then
ws2.Cells(j, 7).Hyperlinks.Add Anchor:=ws2.Cells(j, 7), Address:= _
ws2.Cells(j, 7).Value, TextToDisplay:=ws2.Cells(j, 7).Value
' ws2.Cells(j, 7).Formula = "=HYPERLINK(" & ws2.Cells(j, 7) & ", Address = " & ws2.Cells(j, 7).Value & ")"
'ws2.Hyperlinks.Add Anchor:=j, _
'Address:=ws2.Cells(j, 7).Value, TextToDisplay:=ws2.Cells(j, 7).Value
End If
j = j + 1
Loop
-
\$\begingroup\$ This code takes time when dealing with mare search strings and more number of text files \$\endgroup\$user3609448– user36094482015年06月10日 18:25:22 +00:00Commented Jun 10, 2015 at 18:25
2 Answers 2
A few suggestions:
Most obvious (and simple) step is to read the entire file, and not one line at the time
Change this:
Set file = fso.OpenTextFile(Path) Do While Not file.AtEndOfLine Line = file.ReadLine If InStr(1, Line, theString, vbTextCompare) > 0 Then ws2.Cells(K, 4).Value = Str ws2.Cells(K, 5).Value = fname ws2.Cells(K, 6).Value = "Srting is found" flag = "Y" End If Loop
To this:
Set file = fso.OpenTextFile(Path) fullFile = file.ReadAll If InStr(1, fullFile, theString, vbTextCompare) > 0 Then With ws2 .Cells(K, 4).Value2 = theString .Cells(K, 5).Value2 = fname End With flag = "Y" End If
It eliminates the do-while loop. More details about
.ReadAll()
.Store
Str
andfname
info into an array, then commit the entire array to the worksheet at the end, instead of writing to each cell separately.The "Find" command in DOS might be faster and it can also provide the line number(s) where the string is found:
Find /?
Searches for a text string in a file or files.
FIND [/V] [/C] [/N] [/I] [/OFF[LINE]] "string" [[drive:][path]filename[ ...]]
-
\$\begingroup\$ Thank you Paul. Do the size of the array needs to be determined rt? If am defining a dynamic array I need to fix the size to a point right. In that case I may lose the values. Am I missing anything? \$\endgroup\$user3609448– user36094482015年06月11日 15:52:32 +00:00Commented Jun 11, 2015 at 15:52
-
\$\begingroup\$ There is some guess work to determine the size of the array, but if you know from the start that you'll never have to deal with more than let's say a maximum of 10,000 files (or 100,000), you could initialize the array like this memArr = Range("A1:B10000"), or something similar, just grab the range where you'll want to dump the results at the end. Give yourself enough room so you'll never lose any data (considering any future expansions). If the array you create is way too large, before writing it to the sheet redim it to the total files found (just once) \$\endgroup\$paul bica– paul bica2015年06月11日 16:11:01 +00:00Commented Jun 11, 2015 at 16:11
I agree with @paul bica about the use of .ReadAll
but just wanted to add a couple extra points about your code (but then see point 7).
In your original code in
StringExistsInFile
you should remember to quit the loop once the string is found. Otherwise you will continue to needlessly read the rest of the file.Set file = fso.OpenTextFile(Path) Do While Not file.AtEndOfLine Line = file.ReadLine If InStr(1, Line, theString, vbTextCompare) > 0 Then ws2.Cells(K, 4).Value = Str ws2.Cells(K, 5).Value = fname ws2.Cells(K, 6).Value = "Srting is found" flag = "Y" ' Remember to quit the loop Exit Do End If Loop
In
ShowFolderList
there are some things to refine: you're not defining most of the variable types (Dim f, f1, fc, s, sFldr
) and you've got anIf ... Then ... Else ...
which is redundant because both give the same outcome. You re-use thef1
variable but the first time it representsFolder
objects and the second timeFile
objects. Best not to re-use variables, maybe simple loop counters are ok but definitely shouldn't refer them to different things.In
StringExistsInFile
you've got a bunch of variable assignments that aren't needed and generally confuse what is happening:theString = str path = fpth StrFile = Dir(path)
You are using
StrFile = Dir(path)
and aDo While ... Loop
but you then openfso.OpenTextFile(path)
. You already know that the file exists so you don't need to re-check it. TheDir(string)
function with aDo While ... Loop
is usually used whenstring
contains some wildcards and you want to loop through all the matching files.The
Else
block in this code looks like you want to report whether every string occurs in every file but your output isn't clear on this point:If flag = "Y" Then K = K + 1 Else ws2.Cells(K, 4).Value = str ws2.Cells(K, 5).Value = fname ws2.Cells(K, 6).Value = "String is not Found" K = K + 1 End If
It is a good idea to avoid module-level variables because they make it difficult to track which code assigns & changes the value of the variable. You can pass variables and object references between procs. Also, when you use the
Cells
method always qualify it with the worksheet and, ideally, the workbook. Sometimes you are using the worksheet and other times not.Use early binding for
FileSystemObject
andDictionary
it makes writing the code much easier. In the VBA IDE, go Tools -> References and find and select the Microsoft Scripting Runtime.Indentation makes the code much easier to look at and to see at a glance where procs start & end. I indent everything except proc declarations and
Dim
statements but some people indent the latter as well.It has just occurred to me that the slow part (relatively) of your code will be reading the file because it involves disk access. So it will probably be faster to read all your strings into an array or dictionary, then load the file into memory and loop through the array or dictionary looking for matches. If you use a dictionary the value of each key can be the count of the number of times it was found which then lets you do a summary at the end of those strings that were not found at all.
Here is the code that I ended up with
Option Explicit
Public Sub SearchInSQSDatabase_Click()
Dim ws1 As Worksheet
Dim ws2 As Worksheet
Dim outputRange As Range
Dim stringToFind As String
Dim callString As String
Dim dictOfStrings As Scripting.Dictionary
Dim dictKey As Variant
Dim i As Integer
Application.ScreenUpdating = False
Set ws1 = ThisWorkbook.Sheets(1)
Set ws2 = ThisWorkbook.Sheets(2)
i = 1
ws2.Cells(1, 4).Value = "Search String"
ws2.Cells(1, 5).Value = "Files"
ws2.Cells(1, 6).Value = "Comments"
Set outputRange = ws2.Cells(2, 4)
' dictionary will be all the strings to find with a count
' of the number of times they were found
Set dictOfStrings = New Dictionary
Do While ws1.Cells(RowIndex:=i, ColumnIndex:=1).Value <> ""
'ws2.Cells(i, 1).Value = Cells(i, 1).Value
stringToFind = CStr(ws1.Cells(RowIndex:=i, ColumnIndex:=1).Value)
' Cannot add the same value to dictionary twice
If Not dictOfStrings.Exists(stringToFind) Then
dictOfStrings.Add stringToFind, 0
callString = "CALL " & stringToFind
dictOfStrings.Add callString, 0
Else
MsgBox stringToFind & " exists in your list more than once."
End If
i = i + 1
Loop
ShowFolderList ws1.Cells(5, 4).Value, dictOfStrings, outputRange
' If only want a summary of which strings do not exist in any files
' then do it here
For Each dictKey In dictOfStrings.Keys
stringToFind = CStr(dictKey)
If dictOfStrings.Item(stringToFind) = 0 Then
With outputRange
.Value = stringToFind
.Offset(ColumnOffset:=1).Value = ""
.Offset(ColumnOffset:=2).Value = "String NOT found in any files"
End With
Set outputRange = outputRange.Offset(RowOffset:=1)
End If
Next dictKey
Application.ScreenUpdating = True
End Sub
''''''''''''''''''''''''''''''''''''''''''''''''''''''
'Looks in startingFolder and all subfolders at each file
'and calls StringsExistInFile for each file
'Recursive proc, i.e calls itself for each subfolder.
''''''''''''''''''''''''''''''''''''''''''''''''''''''
Sub ShowFolderList(ByRef startingFolder As String, ByRef dictOfStrings As Scripting.Dictionary, ByRef outputRange As Range)
On Error GoTo local_err
Dim fileSystem As Scripting.FileSystemObject
Dim topFolder As Scripting.Folder
Dim subFolder As Scripting.Folder
Dim thisFile As Scripting.File
Set fileSystem = New Scripting.FileSystemObject
Set topFolder = fileSystem.GetFolder(startingFolder)
' Code needs to call itself for every subfolder found
For Each subFolder In topFolder.SubFolders
ShowFolderList subFolder.path, dictOfStrings, outputRange
Next subFolder
For Each thisFile In topFolder.Files
StringsExistInFile thisFile, dictOfStrings, outputRange
Next thisFile
local_exit:
Exit Sub
local_err:
MsgBox Err & " " & Err.Description
Resume local_exit
Resume
End Sub
''''''''''''''''''''''''''''''''''''''''''''''''''''''
'Check for each String in theFile. Writes the result
'to the output range
''''''''''''''''''''''''''''''''''''''''''''''''''''''
Private Sub StringsExistInFile(ByRef theFile As Scripting.File, ByRef dictOfStrings As Scripting.Dictionary, ByRef outputRange As Range)
Dim inputStream As Scripting.TextStream
Dim fullFile As String
Dim stringToFind As String
Dim dictKey As Variant
Dim foundOk As Boolean
Set inputStream = theFile.OpenAsTextStream(ForReading)
fullFile = inputStream.ReadAll
inputStream.Close
' Loops through the dictionary of strings to find
For Each dictKey In dictOfStrings.Keys
stringToFind = CStr(dictKey)
If InStr(1, fullFile, stringToFind, vbTextCompare) > 0 Then
foundOk = True
' Update the count in dictionary
dictOfStrings.Item(stringToFind) = dictOfStrings.Item(stringToFind) + 1
Else
foundOk = False
End If
' ************************************************************
' Uncomment whichever block of code is relevant for your needs
' ************************************************************
' If only want to say if the string *was* found in this file
' If foundOk Then
' With outputRange
' .Value = stringToFind
' .Offset(ColumnOffset:=1).Value = theFile.Name
' .Offset(ColumnOffset:=2).Value = "String is found"
' End With
' Set outputRange = outputRange.Offset(RowOffset:=1)
' End If
' But if want to say whether every string was found or not in every file
With outputRange
.Value = stringToFind
.Offset(ColumnOffset:=1).Value = theFile.Name
If foundOk Then
.Offset(ColumnOffset:=2).Value = "String is found"
Else
.Offset(ColumnOffset:=2).Value = "String NOT found"
End If
End With
Set outputRange = outputRange.Offset(RowOffset:=1)
Next dictKey
End Sub
-
\$\begingroup\$ All very good points ! I didn't pay close attention to the code before the reading of the file. One other thing to mention: I wouldn't report anything related to the files NOT containing the searched string. I consider "String is not Found" to be white noise, not relevant to process, especially when working with a large number of files (is there a need to know the name of each file not containing the string?) \$\endgroup\$paul bica– paul bica2015年06月11日 17:44:51 +00:00Commented Jun 11, 2015 at 17:44
-
\$\begingroup\$ Thank you Chips!!! Also I need to add few conditions to the string, Like I need to find String value is present or not along with "CALL stringvalue" is present or not .... When I tried that I am getting the LAST string as "Not Found in any files".. please find the updated code \$\endgroup\$user3609448– user36094482015年06月11日 18:07:15 +00:00Commented Jun 11, 2015 at 18:07
-
\$\begingroup\$
callString = "CALL " & stringToFind If Not dictOfStrings.Exists(stringToFind) Then dictOfStrings.Add stringToFind, 0 dictOfStrings.Add callString, 0 Else MsgBox stringToFind & " exists in your list more than once." End If i = i + 1
\$\endgroup\$user3609448– user36094482015年06月11日 18:07:22 +00:00Commented Jun 11, 2015 at 18:07 -
\$\begingroup\$ I wanted to align but not sure y it ended up like above \$\endgroup\$user3609448– user36094482015年06月11日 18:09:42 +00:00Commented Jun 11, 2015 at 18:09
-
1\$\begingroup\$ Done. Of course if you search for "XXX" and "CALL XXX" both will say they have been found if your file contains "CALL XXX". \$\endgroup\$ChipsLetten– ChipsLetten2015年06月11日 18:43:06 +00:00Commented Jun 11, 2015 at 18:43