Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JS: DB reads as taint sources #7474

Draft
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

@kaeluka
Copy link

@kaeluka kaeluka commented Dec 22, 2021

This adds DB reads as taint sources in AdditionalSources.qll.

This is not yet complete, as it's not handling streaming database queries where the result is consumed by a pipe call (eg, knex, spanner, have such features).

@github-actions github-actions bot added the JS label Dec 22, 2021
@owen-mc owen-mc changed the title [draft] DB reads as taint sources [draft] JS: DB reads as taint sources Jan 4, 2022
@kaeluka kaeluka changed the title [draft] JS: DB reads as taint sources JS: DB reads as taint sources Jan 4, 2022
@kaeluka
Copy link
Author

@kaeluka kaeluka commented Jan 4, 2022

Had a chat with @erik-krogh, I'll for now not model streaming APIs. We might choose to do that later.

@kaeluka
Copy link
Author

@kaeluka kaeluka commented Jan 4, 2022

left to be done:

  • implement getResults for rest of DBs. DONE
  • query new results on lgtm to evaluate how noisy those sources will be.
  • bring test up to current idiom (using comments on lines that should be found). DONE
  • performance evaluation. (ACTIVE)

@kaeluka kaeluka force-pushed the db-reads-as-taint-sources branch 4 times, most recently from 0b62a09 to e123c26 Jan 5, 2022
@kaeluka kaeluka force-pushed the db-reads-as-taint-sources branch from e123c26 to 73a85c5 Jan 7, 2022
@kaeluka
Copy link
Author

@kaeluka kaeluka commented Jan 7, 2022

  • query new results on lgtm to evaluate how noisy those sources will be.

I've now used lgtm to select all database accesses.
Octobox and strider are the projects with the highest number of database accesses. I've downloaded those locally and am running my changes against them. The result is that there's on average < 1 result per database access (because some database accesses, like data deletions or updates, do not use a result at all).

This means that there's no surprises. Does that make sense to you, @erik-krogh?

@erik-krogh
Copy link
Contributor

@erik-krogh erik-krogh commented Jan 7, 2022

This means that there's no surprises. Does that make sense to you, @erik-krogh?

That makes sense to me.
I also took at look at your evaluation where you added these DB reads as taint sources.
Performance seems to be within the margin of error.

But the 4 new results all look like FPs to me.
So I think it's the right decision to have the new source as a heuristic source.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

3 participants