-
Notifications
You must be signed in to change notification settings - Fork 764
Implementation of fetch_pdb() #4943
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Conversation
|
I'm not sure where to put this code in the codebase, so I create a new folder for it right now. I'm open to it being moved somewhere Some stuff which I like to still add (besides tests and docs):
|
|
I think others will have to confirm, but likely we'll want to have Additional it's not finalised yet but if the mmcif reader in #2367 gets finalised then the default download shouldn't be |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## develop #4943 +/- ##
===========================================
+ Coverage 92.72% 92.74% +0.01%
===========================================
Files 180 182 +2
Lines 22472 22496 +24
Branches 3188 3192 +4
===========================================
+ Hits 20838 20864 +26
+ Misses 1176 1175 -1
+ Partials 458 457 -1 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
I'm ok with that. I can make the code raise an exception if |
|
Assuming that |
|
You've added it to one of the optional dependency categories which is all that should be required. For the actual files where it is used you'll need to have something setup like the usage of biopython: mdanalysis/package/MDAnalysis/analysis/align.py Lines 200 to 207 in dcaa087
I'm not an expert on the pipelines so someone else would have to pitch in more on that. |
|
Thanks for the comment! |
|
I happen to have another question! Is it normal for some of the tests to not be consistent across each commit? From what I understand, each github CLI has to get and build each MDAnalysis from source, and this instance can potentially timeout from what I observe across each commit. The macOS (of the latest commit) failed at 97% of test because it reached the max wall time of two hours. Even then the latest Azure tests failed because of other tests in the source code which I didn't write (namely due to other tests) |
|
In principle, tests should pass everywhere. The Azure tests time out in the test which looks like something that you added. I haven't looked at your code but it might simply be the case that some stuff needs to be written differently for windows. |
|
@jauy123 do you have time to pick up this PR again? Would be great to have the feature in 2.10! |
|
I have time again. I was busy starting the end of spring break with comps, classes, and you know what. |
…since text files are just binary files with special encoding
orbeckst
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for addressing the last comments. All looking good to me now – ✅ !
Let's keep the top level import mda.fetch_pdb() for now. If we expand "fetchers" then we can revisit. Given that we currently do not document our top-level imports it is sufficient to document topology.PDBParser.fetch_pdb. Perhaps I could encourage you to add an entry to the UserGuide (see Contributing to the User Guide)?
Another thought for the future (and a possible fetchers module) is that fetch() should perhaps return pathlib.Path instances. I do not propose to change fetch_pdb() as this can be the bare-bones workhorse function that we may want to wrap with a generic fetch() function or Universe.from_fetched() static method.
|
Thanks for the feedback!
I would disagree with this idea simply because the standard library still has modules that still takes paths as string (i.e. |
|
@BradyAJohnston Nice to meet you at the UGM! Just want to remind you about this PR. It deviate quite a bit from the initial conception back in March, but I believe this is robust enough to be accepted within the main codebase. |
|
@jauy123 I think @BradyAJohnston is currently here, maybe catch him ;-). Lesson for the future: if you have the opportunity to corner a reviewer of one of your PRs, have them review/approve right while they are sitting in front of you ;-). Potentially bribe them with a beer. |
|
Regarding |
|
Can't go to the botanical garden today xD. I got a car, but I walk everyday to campus. It would take too long for me to get back and drive there.
I forgot! I just remember when I was walking home last night after dinner and brain farted it until five minutes ago. Am taking bribing people as alcohol as official sanctioned Ph.D advice in the future xD. |
I see the point, and I personally use My next question if anything in the standard library which only take strings, or that they all of them take |
Functions in |
|
I didn't know that |
p-j-smith
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work @jauy123!
Blocking whilst we consider moving fetch_pdb into a new module e.g. mda.fetch. It feels like it would be better placed there, rather than in topology.PDBParser as it's not doing any topology parsing
|
I support @p-j-smith ‘s suggestion to create a new top-level MDAnalysis.fetch module. Then this function could be named |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on the discussion, please refactor and create a new top-level MDAnalysis.fetch module to house your new function (and any future "fetch" functionality).
- Create a new
fetchmodule; organize tests and docs accordingly. The docs will get a new top-level section (after coordinates). - Create
fetch/pdb.pyand put your function into this file. - Name your function
from_PDB(). (This makes for nice semantics:pth = fetch.from_PDB(“1ake”)) - Import your function in
fetch/__init__.py:from .pdb import from_PDB
(PS: Please also address the other comments.)
|
Ok, I can get back to work on this since the semester ended. |
hadn't implemented changes yet
orbeckst
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good. Primarily add docs and integrate into the docs under package/doc/sphinx/source/documentation_pages. You'll need a new fetch_modules.rst and also link that file from ../index.rst. Check the rendered docs to see that your docs show up correctly. Put yourself in the shoes of someone who wants to learn more about this new functionality.
Changed default file_format to cif.gz Chaned return type to pathlib.Path Updated test appropriately removed true_basename()
However, I don't know where to put it? And I kinda finicking around with the sphinx code in order to get it actually show up modified: package/MDAnalysis/fetch/__init__.py modified: package/MDAnalysis/fetch/pdb.py new file: package/doc/sphinx/source/documentation_pages/auxiliary/fetch_init.rst new file: package/doc/sphinx/source/documentation_pages/auxiliary/fetch_pdb.rst modified: package/doc/sphinx/source/documentation_pages/auxiliary_modules.rst
|
Comments addressed and fixed: One thing that I need to do (and something that I need input for from the core devs) is where to place |
Fixes #4907
Changes made in this Pull Request:
This is a still work in progress, but here's a implementation of @BradyAJohnston 's code wrapped into classes. I still need to write tests and docs for the entire thing.
DownloaderBaseand 'PDBDownloader' in order to implement downloading structure file from online sources such as the PDB databank.requestsas a dependencymda.fetch_pdb()is implemented as a wrapper to commonly used option in 'PDBDownloader'PR Checklist
package/CHANGELOGfile updated?Developers Certificate of Origin
I certify that I can submit this code contribution as described in the Developer Certificate of Origin, under the MDAnalysis LICENSE.
📚 Documentation preview 📚: https://mdanalysis--4943.org.readthedocs.build/en/4943/