The PubChemQC PM6 datasets

License and Copyright

Copyright © 2019,2020 NAKATA Maho, MAEDA Toshiyuki, SHIMAZAKI Tomomi, HASHIMOTO Masatomo

Creative Commons License
The PubChemQC PM6 datasets are licensed under a Creative Commons Attribution 4.0 International License.

News

2020-10-26: PubChemQC PM6: Data Sets of 221 Million Molecules with Optimized Molecular Geometries and Electronic Properties is now published.

2020-09-08: Updated to 1.0.3.3. Salts are now included in the CHNOPSFClNaKMgCa datasets.

2020-08-19: The essential parts of the job scripts have been uploaded. These scripts are provided for reference only.

2020-06-24: Updated to 1.0.3.2. The sub-datasets were rebuilt to use mnemonic names such as CHON and CHNOPS. Nothing else changed except for sub-dataset 4: we added Mg to it so that it covers the most common elements of the human body, except for fluorine.

2020-06-21: Sub-datasets were added: (1) contains C, H, N and O elements, molecular weight less than 500, and no salt. (2) contains C, H, N, O, S and P elements, molecular weight less than 500, and no salt. (3) contains C, H, N, F, Cl, O, S and P elements, molecular weight less than 500, and no salt. (4) contains C, H, N, F, Cl, O, S, P, K, Na and Ca elements, molecular weight less than 500. The full set is unchanged; only the sub-datasets were added.

2019-05-29: Ver.1.0.3 is released

2019-02-28: Ver.1.0 is released

Downloads

How to download

We strongly recommend using Rclone to download the files, since it lets you fetch them from the command line. There are two ways to download or copy the files.
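
Before following either route, it can help to confirm that the remote you intend to use actually exists in your rclone configuration. The following is a minimal sketch and not part of the original instructions: the function name `pm6_has_remote`, the variables `RCLONE` and `RCLONE_CONF`, and the remote name `mydrive` are hypothetical, and the configuration path is assumed to be the one used in the commands here.

```shell
#!/bin/sh
# Sketch only: RCLONE, RCLONE_CONF and pm6_has_remote are illustrative names.
RCLONE="${RCLONE:-rclone}"
RCLONE_CONF="${RCLONE_CONF:-$HOME/.config/rclone/rclone.conf}"

# Succeeds if the named remote is listed in the rclone configuration.
# "rclone listremotes" prints one remote per line with a trailing colon,
# so the name is matched as a whole line with the colon appended.
pm6_has_remote() {
    "$RCLONE" --config "$RCLONE_CONF" listremotes | grep -qxF "$1:"
}
```

For example, `pm6_has_remote mydrive && echo "remote found"` prints a message only when a remote called `mydrive` is configured.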

How to download the target folder to your local machine

  1. Set up rclone for your Google Drive.
  2. Check the configuration of rclone on your terminal. Please confirm that your rclone is configured correctly with the following command:
    % rclone --config ~/.config/rclone/rclone.conf ls (your rclone conf name):/
    
    You should see the contents of your own Google Drive. Press Ctrl-C to stop the listing if it takes too long.
  3. Visit the following target folder with your Google Account on your web browser:
    https://drive.google.com/drive/folders/1nqoWKZR7oB33YFIKI7FFKeqqOoVOpKr1
    Once accessed, the folder will be visible in your Google Drive "Shared with me" folder, and can be accessed via your own Google Drive with the rclone "--drive-shared-with-me" option.
  4. Download (copy) the target folder to your local machine as follows on your terminal:
    % rclone --config ~/.config/rclone/rclone.conf --drive-shared-with-me sync --progress (your rclone conf name):/pm6opt_ver1.0.3.3/CHNOPSFCl500noSalt ~/(target location on your local machine)
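
Because `sync` makes the destination match the source, including deleting destination files that are absent from the source, it may be worth previewing the transfer before running it for real. The sketch below wraps the command from step 4 in a small function so that extra rclone flags such as `--dry-run` can be appended. The function name `pm6_download`, the variables `RCLONE` and `RCLONE_CONF`, the remote name `mydrive`, and the destination path are assumptions made for illustration; only the folder name and flags come from the step above.

```shell
#!/bin/sh
# Sketch only: RCLONE, RCLONE_CONF and pm6_download are illustrative names.
RCLONE="${RCLONE:-rclone}"
RCLONE_CONF="${RCLONE_CONF:-$HOME/.config/rclone/rclone.conf}"

# pm6_download REMOTE SUBSET DEST [extra rclone flags...]
# Runs the same sync as step 4; any further arguments (for example
# --dry-run) are passed through to rclone unchanged.
pm6_download() {
    pm6_remote="$1"; pm6_subset="$2"; pm6_dest="$3"
    shift 3
    "$RCLONE" --config "$RCLONE_CONF" --drive-shared-with-me \
        sync --progress "$@" \
        "${pm6_remote}:/pm6opt_ver1.0.3.3/${pm6_subset}" "$pm6_dest"
}
```

A cautious run might call `pm6_download mydrive CHNOPSFCl500noSalt "$HOME/pm6" --dry-run` first, inspect what rclone reports it would transfer or delete, and then repeat the call without `--dry-run`.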
    

How to copy the target folder to your own Google Drive

  1. Check the configuration of rclone on your terminal (same as above). Please confirm that your rclone is configured correctly with the following command:
    % rclone --config ~/.config/rclone/rclone.conf ls (your rclone conf name):/
    
    You should see the contents of your own Google Drive. Press Ctrl-C to stop the listing if it takes too long.
  2. Create a symbolic link (a Google Drive shortcut) to the target folder in your own Google Drive.
    1. Visit the following target folder with your Google Account on your web browser:
      https://drive.google.com/drive/folders/1nqoWKZR7oB33YFIKI7FFKeqqOoVOpKr1
    2. Click the target folder name (in the line beginning with "Shared with me ...") and click "Add shortcut to Drive" on the popup menu.
    3. Specify the folder/location where you want to create a symbolic link to the folder.
    4. Check the symbolic link created by the previous step as follows on your terminal:
      % rclone --config ~/.config/rclone/rclone.conf ls (your rclone conf name):/(location where you created the symbolic link in Step 2)
      
  3. Copy the target folder to your Google Drive as follows on your terminal:
    % rclone --config ~/.config/rclone/rclone.conf sync --progress (your rclone conf name):/(location where you created the symbolic link in Step 2) (your rclone conf name):/(target location)
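
After a Drive-to-Drive copy of this size, you may want to confirm that the copy is complete. One possible approach, sketched below under assumptions, is to follow the `sync` from step 3 with `rclone check`, which compares the files in the source and destination and reports differences. The function name `pm6_drive_copy`, the variables `RCLONE` and `RCLONE_CONF`, and the example remote and paths are hypothetical.

```shell
#!/bin/sh
# Sketch only: RCLONE, RCLONE_CONF and pm6_drive_copy are illustrative names.
RCLONE="${RCLONE:-rclone}"
RCLONE_CONF="${RCLONE_CONF:-$HOME/.config/rclone/rclone.conf}"

# pm6_drive_copy REMOTE SRC_PATH DST_PATH
# Syncs SRC_PATH (the location of the symbolic link from Step 2) to
# DST_PATH on the same remote, then runs "rclone check" on the pair.
# The check only runs if the sync exits successfully.
pm6_drive_copy() {
    pm6_remote="$1"; pm6_src="$2"; pm6_dst="$3"
    "$RCLONE" --config "$RCLONE_CONF" sync --progress \
        "${pm6_remote}:/${pm6_src}" "${pm6_remote}:/${pm6_dst}" &&
    "$RCLONE" --config "$RCLONE_CONF" check \
        "${pm6_remote}:/${pm6_src}" "${pm6_remote}:/${pm6_dst}"
}
```

For instance, `pm6_drive_copy mydrive shortcuts/pm6 backup/pm6` would copy from a shortcut placed under `shortcuts/pm6` to `backup/pm6`; a non-zero exit status indicates that either the sync or the check reported a problem.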
    

ChangeLog

ChangeLog Ver. 1.0.3.3
ChangeLog Ver. 1.0.3.2
ChangeLog Ver. 1.0.3.1
ChangeLog Ver. 1.0.3

Known issues

No known issues found.

Older versions

The PubChemQC PM6 dataset (ver.1.0.3.2) can be downloaded from here.
The PubChemQC PM6 dataset (ver.1.0.3.1) can be downloaded from here.
The PubChemQC PM6 dataset (ver.1.0.3) can be downloaded from here.
The PubChemQC PM6 dataset (ver.1.0.0) can be downloaded from here.

Reference

(published version) PubChemQC PM6: Data Sets of 221 Million Molecules with Optimized Molecular Geometries and Electronic Properties
(arXiv version) PubChemQC PM6: A dataset of 221 million molecules with optimized molecular geometries and electronic properties
Nakata Maho