- 
  Notifications
 You must be signed in to change notification settings 
- Fork 249
-r Recursive option #129
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
-r Recursive option #129
Conversation
 
 
 pyt/__main__.py
 
 Outdated
 
 
 There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just to see if it works
Hi @omergunal, thanks for the soup! :D
Can you merge master into this branch and then resolve the merge conflicts? 👍
No problem :) , ok i will do it
in usage.py "filepath" is must required. should we do optional? because if we use "-r" we do not need it
Can you also add
parser.add_argument( 'targets', metavar='targets', type=str, nargs='*', help='source file(s) or directory(s) to be tested' )
Let's do this https://github.com/PyCQA/bandit/blob/master/bandit/cli/main.py#L153-L160 and then remove -f :)
I realize I'm totally changing my mind from what I said before, "This will enable a user to just give -r /path/to/files instead of -f file one at a time." but this seems cleaner.
i.e. -r means we recursively search directories, and is a bool. Whereas the targets, (positional arguments) pyt foo.py controllers/will be the targets we analyze.
You mean, we will delete "-f" option and use "-r" for both single file scan and directory scan.And if user will not use any parameter, it will scan one file. Is that correct?
 
 
 pyt/usage.py
 
 Outdated
 
 
 There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
De-dent )
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe make it
 parser.add_argument(
 '-r', '--recursive', dest='recursive',
 action='store_true', help='find and process files in subdirectories'
 )
 
 
 pyt/__main__.py
 
 Outdated
 
 
 There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: We mostly use list() everywhere in assignments in the codebase, just for consistency.
 
 
 pyt/__main__.py
 
 Outdated
 
 
 There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: just for consistency, you can do return included_files (and I guess rename files to included_files)
re:"You mean, we will delete -f option and use -r for both single file scan and directory scan.And if user will not use any parameter, it will scan one file. Is that correct?"
Delete -f
Use targets for both file and directory scan.
Use -r for doing something like
def discover_files(targets, excluded_files, recursion): included_files = list() excluded_list = excluded_files.split(",") for target in targets: if target.endswith('.py'): included_files.append(target) else: for root, dirs, files in os.walk(target): for f in files: fullpath = os.path.join(root, f) if os.path.splitext(fullpath)[1] == '.py' and fullpath.split("/")[-1] not in excluded_list: included_files.append(fullpath) if not recursion: break return included_files
(just updated the code, should be better now.)
it seems good about returning "included_list". also "-x " parameter is available.
 
 
 pyt/__main__.py
 
 Outdated
 
 
 There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it might be more DRY if you did
files = discover_files( args.targets, args.excluded_paths, args.recursive )
 
 
 pyt/usage.py
 
 Outdated
 
 
 There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess targets will be part of _add_required_group b/c it's replacing -f files
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Almost there, looking good :D
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Before this for loop, you can have a vulnerabilities = list(), and then do
vulnerabilities.append(find_vulnerabilities( cfg_list, ui_mode, args.blackbox_mapping_file, args.trigger_word_file, nosec_lines ))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it seems bad
selection_102 
there is no problem at the moment. why we should change like this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it find vulnerabilities in all the files, or just the last file (i.e. last iteration of the loop)? If I'm reading it write I think it might do e.g. vulnerabilites = kitmap_vulns...then finally vulnerabilities = a.py_vulns, and only report the last list.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(As an aside, it seems strange it's not printing out the vulnerability info, and just seems to print the object.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right, it taking last item on the list. have you idea for fix?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the fix is having a list outside of the for loop and adding the vulnerabilities of each file to it. The code from my first comment should do it, although now that I think about it it's probably extend and not append.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
^Ahh, this was it! 👍 Just change append to extend and it'll all work! :D
 
 
 pyt/__main__.py
 
 Outdated
 
 
 There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can de-dent this, if args.baseline: as only one call to get_vulnerabilities_not_in_baseline, with the vulnerabilities of every file will work.
 
 
 pyt/__main__.py
 
 Outdated
 
 
 There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i created list before loop as you said. the same problem about last item is still continue.
a 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, That's odd, I'll checkout/test your code later today to try and find the issue. 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's really weird, I'll look more in-depth on Monday 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I checked out your branch and it was append vs. extend
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Super close, just the de-dent and append vs. extend and I think that's mostly it :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can keep this, but just de-dent it so that we only trim once.
So I looked at the tests that were failing and the Mock stuff
You can do e.g.
class MainTest(BaseTestCase): + @mock.patch('pyt.__main__.discover_files') @mock.patch('pyt.__main__.parse_args') @mock.patch('pyt.__main__.find_vulnerabilities') @mock.patch('pyt.__main__.text') - def test_text_output(self, mock_text, mock_find_vulnerabilities, mock_parse_args): + def test_text_output(self, mock_text, mock_find_vulnerabilities, mock_parse_args, mock_discover_files): mock_find_vulnerabilities.return_value = 'stuff' example_file = 'examples/vulnerable_code/inter_command_injection.py' output_file = 'mocked_outfile' + mock_discover_files.return_value = [example_file] mock_parse_args.return_value = mock.Mock( autospec=True, - filepath=example_file, project_root=None, baseline=None, json=None,
and the same for the other tests.
This makes it so that in the tests, we don't really ever call discover_files, but instead "mock" it, and just use the return value that we want to.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Look good to you?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, there are no vulnerability in a.py b.py and c.py but it printing from xss.py
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't figure out the bug yet, gonna look more tomorrow 😁 This is harder than expected to track down
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, i fixed it. its about vulnerabilities = list() location
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you write tests for discover_files when you get a chance? 👍
 
 
 pyt/__main__.py
 
 Outdated
 
 
 There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So if targets is a list of files, e.g. python -m pyt examples/vulnerable_code/command_injection.py examples/vulnerable_code/XSS.py, then discover_files will return the first file N times. (Where N is the len of targets.)
 
 
 pyt/__main__.py
 
 Outdated
 
 
 There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So having if recursive: here it will make it so that if you don't have -r then you won't search directories.
You can change it to:
def discover_files(targets, excluded_files, recursive=False): included_files = list() excluded_list = excluded_files.split(",") for target in targets: if os.path.isdir(target): for root, dirs, files in os.walk(target): for f in files: fullpath = os.path.join(root, f) if os.path.splitext(fullpath)[1] == '.py' and fullpath.split("/")[-1] not in excluded_list: included_files.append(fullpath) if not recursive: break else: if target not in excluded_list: included_files.append(target) return included_files
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i added the list inside to loop
s 
Its ok now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the bug I found yesterday, or more accurately the thing I don't understand 😕 , is that find_vulnerabilities returns the vulnerabilities for all the files previously analyzed, as if find_vulnerabilities knows all the vulnerabilities found for the other files we've looked at, how does it know this? 😱
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So as far as the PR, you can change it to what you had originally, i.e. vulnerabilities = find_vulnerabilities(..), sorry I misunderstood the code, I'll still look into the reason why the code does this though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aha, figured it out, so I knew constraint_table etc. were global variables that keep state, and that they could be the culprit if they were used weirdly, however it is due to FrameworkAdaptor https://github.com/python-security/pyt/blob/master/pyt/web_frameworks/framework_adaptor.py#L88 adding all the past CFGs to the list :) I'll add a comment to __main__.py about it after we merge this PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Almost there :)
 
 
 pyt/__main__.py
 
 Outdated
 
 
 There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's important for the if not recursive: to be after the
fullpath = os.path.join(root, f) if os.path.splitext(fullpath)[1] == '.py' and fullpath.split("/")[-1] not in excluded_list: included_files.append(fullpath)
this is so that we only iterate through the for f in files: once. i.e. just one-level of depth and not recursively.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, i did it
 
 
 pyt/__main__.py
 
 Outdated
 
 
 There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for de-denting the if args.baseline: You can de-dent the vulnerabilities = get_vulnerabilities_not_in_baseline( too, 👍
 
 
 pyt/__main__.py
 
 Outdated
 
 
 There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this line, for root, dirs, files in os.walk(target): is indented one more level than it has to be.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i will try to do better for returning "included_files"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, just make the tests pass and I'll merge 👍 (Feel free to write tests for discover_files if you'd like to though.)
 
 
 pyt/__main__.py
 
 Outdated
 
 
 There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: You can de-dent from line 38 to line 44
 
 
 pyt/__main__.py
 
 Outdated
 
 
 There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: You can delete this newline
and its done. i will write test for discover_files later
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So happy to merge 🎊 🎈 🎉 🎂
...ll versions, add travis commands to tox so this does not happen again
Uh oh!
There was an error while loading. Please reload this page.
Issue: #127
There is a few steps for completing this PR. Now we can get all ".py" files in directory and exclude some files with "-x" option.