6
\$\begingroup\$

I have written this code to get the input string from the spreadsheet and search the string across the text files and state whether or not it is found in the Excel sheet.

Scenario:

Excel (sheet1):

ColumnA
AAA 
BBB

Drive C:

C: should be give in Column D5.

It has subfolders TEMP1, TEMP2, ...

  • Folder C:\TEMP1\

    It has these text files:

    • X1.txt (has content AAA)
    • X2.txt (doesn't have any search data)
  • Folder C:\TEMP2\

    It has these text files:

    • Y1.txt (has content BBB)
    • Y2.txt (doesn't have any search data)
Public K As Integer
Dim fs As Object
Dim fso As Object
Public fpth As String
Public str As String
Public Sub SearchInSQSDatabase_Click()
Dim ws1 As Worksheet
Set ws1 = Sheets(1)
Dim ws2 As Worksheet
Set ws2 = Sheets(2)
K = 2
Dim i As Integer
i = 1
ws2.Cells(1, 4).Value = "Search String"
ws2.Cells(1, 5).Value = "Files"
ws2.Cells(1, 6).Value = "Comments"
Do While Cells(i, 1).Value <> ""
 'ws2.Cells(i, 1).Value = Cells(i, 1).Value
 str = Cells(i, 1).Value
 ShowFolderList (ws1.Cells(5, 4).Value)
 i = i + 1
Loop
End Sub
''''''''''''''''''''''''''''''''''''''''''''''''''''''
'Get the List of Files and folders
''''''''''''''''''''''''''''''''''''''''''''''''''''''
Sub ShowFolderList(folderspec)
 On Error GoTo local_err
 Dim f, f1, fc, s, sFldr
 Dim ws2 As Worksheet
 Set ws2 = Sheets(2)
 Set fs = CreateObject("Scripting.FileSystemObject")
 Set f = fs.GetFolder(folderspec)
 Set fc = f.SubFolders
 For Each f1 In fc
 ' If Right(f1, 1) <> "\" Then ShowFolderList f1 & "\" Else ShowFolderList f1
 If Right(f1, 1) <> "\" Then ShowFolderList f1 Else ShowFolderList f1
 Next
Set fc = f.Files
For Each f1 In fc
 ' ws2.Cells(K, 4).Value = folderspec & "\" & f1.Name
 fpth = folderspec & "\" & f1.Name
 StringExistsInFile (f1.Name)
 'K = K + 1
Next
local_exit:
 Exit Sub
 local_err:
 MsgBox Err & " " & Err.Description
 Resume local_exit
 Resume
 End Sub
 ''''''''''''''''''''''''''''''''''''''''''''''''''''''
 'Check for the String
 ''''''''''''''''''''''''''''''''''''''''''''''''''''''
Sub StringExistsInFile(fname)
Dim theString, callString As String
Dim path As String 
Dim StrFile As String
Dim file
Dim line As String
Dim flag As String
Set fso = CreateObject("Scripting.FileSystemObject")
Dim ws2 As Worksheet
Set ws2 = Sheets(2)
theString = str
path = fpth
StrFile = Dir(path)
Do While StrFile <> ""
'Find TheString in the file
'If found, debug.print and exit loop
Set file = fso.OpenTextFile(path)
Do While Not file.AtEndOfLine
 line = file.ReadLine
 If InStr(1, line, theString, vbTextCompare) > 0 Then
 ws2.Cells(K, 4).Value = str
 ws2.Cells(K, 5).Value = fname
 ws2.Cells(K, 6).Value = "Srting is found"
 flag = "Y"
 End If
Loop
file.Close
Set file = Nothing
Set fso = Nothing
StrFile = Dir()
Loop
If flag = "Y" Then
 K = K + 1
 Else
 ws2.Cells(K, 4).Value = str
 ws2.Cells(K, 5).Value = fname
 ws2.Cells(K, 6).Value = "String is not Found"
 K = K + 1
 End If
 End Sub

Output:

Output should be written in sheet2:

ColumnA columnB 
AAA X1 
BBB Y1

I have updated the code to have a hyperlink in the text file name to open the text file for the strings that are found, also need to no the count of column

added the below logic at the end, I feel the below code can be written in better way

Do While ws2.Cells(K, 6).Value <> ""
 K = K + 1
 Loop
Do While j < K
If ws2.Cells(j, 7).Value <> "" Then
 ws2.Cells(j, 7).Hyperlinks.Add Anchor:=ws2.Cells(j, 7), Address:= _
 ws2.Cells(j, 7).Value, TextToDisplay:=ws2.Cells(j, 7).Value
 ' ws2.Cells(j, 7).Formula = "=HYPERLINK(" & ws2.Cells(j, 7) & ", Address = " & ws2.Cells(j, 7).Value & ")"
 'ws2.Hyperlinks.Add Anchor:=j, _
 'Address:=ws2.Cells(j, 7).Value, TextToDisplay:=ws2.Cells(j, 7).Value
 End If
 j = j + 1
Loop 
asked Jun 10, 2015 at 18:23
\$\endgroup\$
1
  • \$\begingroup\$ This code takes time when dealing with mare search strings and more number of text files \$\endgroup\$ Commented Jun 10, 2015 at 18:25

2 Answers 2

5
\$\begingroup\$

A few suggestions:

  1. Most obvious (and simple) step is to read the entire file, and not one line at the time

    Change this:

    Set file = fso.OpenTextFile(Path)
    Do While Not file.AtEndOfLine
     Line = file.ReadLine
     If InStr(1, Line, theString, vbTextCompare) > 0 Then
     ws2.Cells(K, 4).Value = Str
     ws2.Cells(K, 5).Value = fname
     ws2.Cells(K, 6).Value = "Srting is found"
     flag = "Y"
     End If
    Loop
    

    To this:

    Set file = fso.OpenTextFile(Path)
    fullFile = file.ReadAll
    If InStr(1, fullFile, theString, vbTextCompare) > 0 Then
     With ws2
     .Cells(K, 4).Value2 = theString
     .Cells(K, 5).Value2 = fname
     End With
     flag = "Y"
    End If
    

    It eliminates the do-while loop. More details about .ReadAll().

  2. Store Str and fname info into an array, then commit the entire array to the worksheet at the end, instead of writing to each cell separately.

  3. The "Find" command in DOS might be faster and it can also provide the line number(s) where the string is found:

    Find /?

    Searches for a text string in a file or files.

    FIND [/V] [/C] [/N] [/I] [/OFF[LINE]] "string" [[drive:][path]filename[ ...]]

Jamal
35.2k13 gold badges134 silver badges238 bronze badges
answered Jun 10, 2015 at 23:44
\$\endgroup\$
2
  • \$\begingroup\$ Thank you Paul. Do the size of the array needs to be determined rt? If am defining a dynamic array I need to fix the size to a point right. In that case I may lose the values. Am I missing anything? \$\endgroup\$ Commented Jun 11, 2015 at 15:52
  • \$\begingroup\$ There is some guess work to determine the size of the array, but if you know from the start that you'll never have to deal with more than let's say a maximum of 10,000 files (or 100,000), you could initialize the array like this memArr = Range("A1:B10000"), or something similar, just grab the range where you'll want to dump the results at the end. Give yourself enough room so you'll never lose any data (considering any future expansions). If the array you create is way too large, before writing it to the sheet redim it to the total files found (just once) \$\endgroup\$ Commented Jun 11, 2015 at 16:11
4
\$\begingroup\$

I agree with @paul bica about the use of .ReadAll but just wanted to add a couple extra points about your code (but then see point 7).

  1. In your original code in StringExistsInFile you should remember to quit the loop once the string is found. Otherwise you will continue to needlessly read the rest of the file.

    Set file = fso.OpenTextFile(Path)
    Do While Not file.AtEndOfLine
     Line = file.ReadLine
     If InStr(1, Line, theString, vbTextCompare) > 0 Then
     ws2.Cells(K, 4).Value = Str
     ws2.Cells(K, 5).Value = fname
     ws2.Cells(K, 6).Value = "Srting is found"
     flag = "Y"
     ' Remember to quit the loop
     Exit Do
     End If
    Loop
    
  2. In ShowFolderList there are some things to refine: you're not defining most of the variable types (Dim f, f1, fc, s, sFldr) and you've got an If ... Then ... Else ... which is redundant because both give the same outcome. You re-use the f1 variable but the first time it represents Folder objects and the second time File objects. Best not to re-use variables, maybe simple loop counters are ok but definitely shouldn't refer them to different things.

  3. In StringExistsInFile you've got a bunch of variable assignments that aren't needed and generally confuse what is happening:

    theString = str 
    path = fpth 
    StrFile = Dir(path) 
    

    You are using StrFile = Dir(path) and a Do While ... Loop but you then open fso.OpenTextFile(path). You already know that the file exists so you don't need to re-check it. The Dir(string) function with a Do While ... Loop is usually used when string contains some wildcards and you want to loop through all the matching files.

    The Else block in this code looks like you want to report whether every string occurs in every file but your output isn't clear on this point:

    If flag = "Y" Then
     K = K + 1
    Else
     ws2.Cells(K, 4).Value = str
     ws2.Cells(K, 5).Value = fname
     ws2.Cells(K, 6).Value = "String is not Found"
     K = K + 1
    End If
    
  4. It is a good idea to avoid module-level variables because they make it difficult to track which code assigns & changes the value of the variable. You can pass variables and object references between procs. Also, when you use the Cells method always qualify it with the worksheet and, ideally, the workbook. Sometimes you are using the worksheet and other times not.

  5. Use early binding for FileSystemObject and Dictionary it makes writing the code much easier. In the VBA IDE, go Tools -> References and find and select the Microsoft Scripting Runtime.

  6. Indentation makes the code much easier to look at and to see at a glance where procs start & end. I indent everything except proc declarations and Dim statements but some people indent the latter as well.

  7. It has just occurred to me that the slow part (relatively) of your code will be reading the file because it involves disk access. So it will probably be faster to read all your strings into an array or dictionary, then load the file into memory and loop through the array or dictionary looking for matches. If you use a dictionary the value of each key can be the count of the number of times it was found which then lets you do a summary at the end of those strings that were not found at all.

Here is the code that I ended up with

Option Explicit
Public Sub SearchInSQSDatabase_Click()
Dim ws1 As Worksheet
Dim ws2 As Worksheet
Dim outputRange As Range
Dim stringToFind As String
Dim callString As String
Dim dictOfStrings As Scripting.Dictionary
Dim dictKey As Variant
Dim i As Integer
 Application.ScreenUpdating = False
 Set ws1 = ThisWorkbook.Sheets(1)
 Set ws2 = ThisWorkbook.Sheets(2)
 i = 1
 ws2.Cells(1, 4).Value = "Search String"
 ws2.Cells(1, 5).Value = "Files"
 ws2.Cells(1, 6).Value = "Comments"
 Set outputRange = ws2.Cells(2, 4)
 ' dictionary will be all the strings to find with a count
 ' of the number of times they were found
 Set dictOfStrings = New Dictionary
 Do While ws1.Cells(RowIndex:=i, ColumnIndex:=1).Value <> ""
 'ws2.Cells(i, 1).Value = Cells(i, 1).Value
 stringToFind = CStr(ws1.Cells(RowIndex:=i, ColumnIndex:=1).Value)
 ' Cannot add the same value to dictionary twice
 If Not dictOfStrings.Exists(stringToFind) Then
 dictOfStrings.Add stringToFind, 0
 callString = "CALL " & stringToFind
 dictOfStrings.Add callString, 0
 Else
 MsgBox stringToFind & " exists in your list more than once."
 End If
 i = i + 1
 Loop
 ShowFolderList ws1.Cells(5, 4).Value, dictOfStrings, outputRange
 ' If only want a summary of which strings do not exist in any files
 ' then do it here
 For Each dictKey In dictOfStrings.Keys
 stringToFind = CStr(dictKey)
 If dictOfStrings.Item(stringToFind) = 0 Then
 With outputRange
 .Value = stringToFind
 .Offset(ColumnOffset:=1).Value = ""
 .Offset(ColumnOffset:=2).Value = "String NOT found in any files"
 End With
 Set outputRange = outputRange.Offset(RowOffset:=1)
 End If
 Next dictKey
 Application.ScreenUpdating = True
End Sub
''''''''''''''''''''''''''''''''''''''''''''''''''''''
'Looks in startingFolder and all subfolders at each file
'and calls StringsExistInFile for each file
'Recursive proc, i.e calls itself for each subfolder.
''''''''''''''''''''''''''''''''''''''''''''''''''''''
Sub ShowFolderList(ByRef startingFolder As String, ByRef dictOfStrings As Scripting.Dictionary, ByRef outputRange As Range)
On Error GoTo local_err
Dim fileSystem As Scripting.FileSystemObject
Dim topFolder As Scripting.Folder
Dim subFolder As Scripting.Folder
Dim thisFile As Scripting.File
 Set fileSystem = New Scripting.FileSystemObject
 Set topFolder = fileSystem.GetFolder(startingFolder)
 ' Code needs to call itself for every subfolder found
 For Each subFolder In topFolder.SubFolders
 ShowFolderList subFolder.path, dictOfStrings, outputRange
 Next subFolder
 For Each thisFile In topFolder.Files
 StringsExistInFile thisFile, dictOfStrings, outputRange
 Next thisFile
local_exit:
 Exit Sub
local_err:
 MsgBox Err & " " & Err.Description
 Resume local_exit
 Resume
End Sub
''''''''''''''''''''''''''''''''''''''''''''''''''''''
'Check for each String in theFile. Writes the result
'to the output range
''''''''''''''''''''''''''''''''''''''''''''''''''''''
Private Sub StringsExistInFile(ByRef theFile As Scripting.File, ByRef dictOfStrings As Scripting.Dictionary, ByRef outputRange As Range)
Dim inputStream As Scripting.TextStream
Dim fullFile As String
Dim stringToFind As String
Dim dictKey As Variant
Dim foundOk As Boolean
 Set inputStream = theFile.OpenAsTextStream(ForReading)
 fullFile = inputStream.ReadAll
 inputStream.Close
 ' Loops through the dictionary of strings to find
 For Each dictKey In dictOfStrings.Keys
 stringToFind = CStr(dictKey)
 If InStr(1, fullFile, stringToFind, vbTextCompare) > 0 Then
 foundOk = True
 ' Update the count in dictionary
 dictOfStrings.Item(stringToFind) = dictOfStrings.Item(stringToFind) + 1
 Else
 foundOk = False
 End If
' ************************************************************
' Uncomment whichever block of code is relevant for your needs
' ************************************************************
 ' If only want to say if the string *was* found in this file
' If foundOk Then
' With outputRange
' .Value = stringToFind
' .Offset(ColumnOffset:=1).Value = theFile.Name
' .Offset(ColumnOffset:=2).Value = "String is found"
' End With
' Set outputRange = outputRange.Offset(RowOffset:=1)
' End If
 ' But if want to say whether every string was found or not in every file
 With outputRange
 .Value = stringToFind
 .Offset(ColumnOffset:=1).Value = theFile.Name
 If foundOk Then
 .Offset(ColumnOffset:=2).Value = "String is found"
 Else
 .Offset(ColumnOffset:=2).Value = "String NOT found"
 End If
 End With
 Set outputRange = outputRange.Offset(RowOffset:=1)
 Next dictKey
End Sub
answered Jun 11, 2015 at 16:31
\$\endgroup\$
11
  • \$\begingroup\$ All very good points ! I didn't pay close attention to the code before the reading of the file. One other thing to mention: I wouldn't report anything related to the files NOT containing the searched string. I consider "String is not Found" to be white noise, not relevant to process, especially when working with a large number of files (is there a need to know the name of each file not containing the string?) \$\endgroup\$ Commented Jun 11, 2015 at 17:44
  • \$\begingroup\$ Thank you Chips!!! Also I need to add few conditions to the string, Like I need to find String value is present or not along with "CALL stringvalue" is present or not .... When I tried that I am getting the LAST string as "Not Found in any files".. please find the updated code \$\endgroup\$ Commented Jun 11, 2015 at 18:07
  • \$\begingroup\$ callString = "CALL " & stringToFind If Not dictOfStrings.Exists(stringToFind) Then dictOfStrings.Add stringToFind, 0 dictOfStrings.Add callString, 0 Else MsgBox stringToFind & " exists in your list more than once." End If i = i + 1 \$\endgroup\$ Commented Jun 11, 2015 at 18:07
  • \$\begingroup\$ I wanted to align but not sure y it ended up like above \$\endgroup\$ Commented Jun 11, 2015 at 18:09
  • 1
    \$\begingroup\$ Done. Of course if you search for "XXX" and "CALL XXX" both will say they have been found if your file contains "CALL XXX". \$\endgroup\$ Commented Jun 11, 2015 at 18:43

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.